AI Translation for my Historical Fiction Novel

My current focus is a historical fiction novel about the events following the attempted rape and murder of Maria Goretti, an 11-year-old peasant who lived in 1902 Italy.

When I was 11, not in 1902 but in 2002, I attended a school named after her in Westfield, Indiana: Saint Maria Goretti Catholic School, or SMG as we called it. As our namesake, Maria’s story was told often, shared and celebrated, and, as I remember it, left me with a big ol’ lump in my tummy.

The story goes that a beautiful Italian girl, diligently doing her chores, was approached by a neighbor boy. He wanted to do things that should only be done after marriage. Maria reminded him that this was a sin, so he gave her an ultimatum: do it or die. She chose death, and he stabbed her 14 times. On her last breath, she forgave him.

Female students were given silver Maria necklaces. I wore mine proudly on the outside of my uniform, fiddling with it during conversations, because I didn’t want anyone to know my secret. The secret that I was not as brave as Maria. Confronted with her choice, “death or defilement” (as it was described), I knew I’d choose defilement. I didn’t want to die, especially by stabbing. I wanted to live. I mean, there were so many things I wanted to do: win my basketball tournament, listen to Aaron Carter, finish reading The Outsiders, go to high school…

Now, an adult woman, I see the story clearly has some questionable beats. But when I started to investigate, I ran into a problem. The Catholic Church’s story has become the story. Ask ChatGPT, look at Wikipedia. They both cite the Vatican as the authoritative source for our 11-year-old girl.

Fortunately, it’s apparent even in their story that the crime against Maria made it to the Court of Assizes in nearby Rome. And that, dear readers, means we have a paper trail. Newspapers and trial records and police reports and witnesses and…

I found a lead, and I chased it with urgency. For my younger self crippled by shame. For Maria. I had to find out what really happened.

As it goes, all these original sources are in Italian, never translated to English. And why would they be if those writing her story wanted to contain their version of events?

So that is what this piece is about: how I used AI to turn those Italian sources into English, what it did well, and where I learned to read it with a very suspicious eye.

The translation pipeline

First, the scanning

I used an old-fashioned scanner from the early 2000s. Its software had built-in OCR (optical character recognition, which can identify words in an image and convert them to text). The OCR was imperfect on smudged and faded pages and on the older typefaces, mistaking a long ſ for an f and treating a spot of foxing as a comma, so I had to give each page a once-over and manually correct any obvious issues.
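For anyone who wants help with that once-over, here is a minimal sketch of how the long-ſ problem could be flagged automatically. I did my corrections by hand, so this is an illustration and not my actual process. The function names and the tiny word list are hypothetical; a real pass would need a full Italian dictionary, and the sample sentence is made up to show the error pattern.

```python
from itertools import product

def long_s_candidates(word: str, known_words: set[str]) -> list[str]:
    """Return known words reachable by turning some 'f' characters back into 's'."""
    positions = [i for i, ch in enumerate(word) if ch == "f"]
    found = []
    # Try every combination of keeping an 'f' or swapping it for an 's'.
    for choice in product("fs", repeat=len(positions)):
        chars = list(word)
        for pos, ch in zip(positions, choice):
            chars[pos] = ch
        candidate = "".join(chars)
        if candidate != word and candidate in known_words:
            found.append(candidate)
    return found

# Hypothetical mini word list, only for this demonstration.
KNOWN = {"stesso", "fosse", "santa", "casa"}

def flag_suspects(text: str, known_words: set[str] = KNOWN) -> dict[str, list[str]]:
    """Map each unknown word to the known words it may have been before OCR."""
    suspects = {}
    for raw in text.split():
        word = raw.strip(".,;:!?").lower()
        if word and word not in known_words:
            fixes = long_s_candidates(word, known_words)
            if fixes:
                suspects[word] = fixes
    return suspects

# Made-up OCR output in which every long s came through as an f.
print(flag_suspects("La cafa fteffo fosse fanta."))
```

The sketch only suggests fixes; it never rewrites the text on its own, because a word like fosse is legitimate Italian and a human still has to make the call.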

Second, the translating

I did this wayyyy back in early 2025, so my approach is a bit dated now. I built a custom GPT inside of ChatGPT, which at that time used OpenAI’s GPT-4o model.

The way a custom GPT works is you give it instructions on its purpose, goals, and approach, and it follows them every time you interact with it. I experimented with different instructions until I felt like I was getting strong translations.

My experimental approach was to draft a set of instructions, paste in two paragraphs of Italian, then take the English output and translate it back to Italian. Essentially, I was looking for a balance between comprehensible English output and a re-translated Italian that matched the original text.
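I compared the round trips by eye, but the same check can be roughly scored in code. The sketch below is one possible way to do it, using Python's standard difflib module to measure how close the re-translated Italian is to the original. The function names are my own invention for this example, and the second back-translation is a made-up stand-in for a translation that drifted.

```python
from difflib import SequenceMatcher

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting noise is ignored."""
    return " ".join(text.lower().split())

def round_trip_score(original_it: str, back_translated_it: str) -> float:
    """Return a 0-to-1 similarity between the original and re-translated Italian."""
    matcher = SequenceMatcher(
        None, normalize(original_it), normalize(back_translated_it), autojunk=False
    )
    return matcher.ratio()

original = "Si lasciò trafiggere da quattordici colpi di pugnale."
# Hypothetical round-trip results from two different instruction drafts.
faithful = "Si lasciò trafiggere da quattordici colpi di pugnale."
drifted = "Fu ferita molte volte con un coltello."

print(round_trip_score(original, faithful))
print(round_trip_score(original, drifted))
```

A high score only says the wording survived the round trip. It says nothing about whether the English in the middle reads well, which is why I still read every output myself.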

Here are the final instructions:

You are an expert Italian-to-English literary translator. Your task is to translate Italian texts into clear and coherent English while maintaining the original tone, nuance, and rhythm. Do not summarize, condense, or omit information. Maintain paragraph structure and section headings from the original. Do NOT convert Italian names to English. The result should feel like a faithful yet elegant translation, neither stiffly literal nor overly free.

Since I first built this workflow, a 2025 study by Du et al. looked specifically at how to get more creative literary translations out of ChatGPT. The researchers tested a range of prompts and settings, and the best-performing setup was surprisingly simple:

“Translate the following text into [target language] creatively.”

This one-line prompt outperformed the more elaborate prompting strategies. So, in hindsight, my custom instructions were probably doing too much. If I were starting over now, I’d begin with the simpler prompt and only add constraints if the translation clearly needed them.

Literal vs. literary AI translation

A distinction that I came to appreciate was the one between two types of translation.

First is literal translation. What does this sentence say? This is the job that Google Translate and ChatGPT are decent at, especially for a well-trafficked pair like Italian and English. Benchmarks confirm that these tools reliably carry the plain meaning across (Jiao et al., 2023).

Second is literary translation. What the word means, does, conveys, signals. Its tone, its rhythm, the weight it carried in its own time and place. This one is much harder, and it is where AI still falls short. Machine translation tends to reach for the option that is technically correct but literal, safe, and a little flat (Guerberof Arenas & Toral, 2022). Even the 2025 study that went looking for more creativity found that ChatGPT still trailed human translators.

A human skilled in literary translation can cost roughly $0.15 to $0.40 per word. For the volume of source material I was working with, that would have meant spending tens of thousands of dollars. By comparison, my ChatGPT subscription cost $20 a month.
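To make that comparison concrete, here is the arithmetic as a small sketch. The per-word rates and the $20 subscription come from the paragraph above. The 100,000-word volume and the six-month timeline are hypothetical placeholders, not my real figures, so swap in your own.

```python
def human_cost_range(word_count: int, low_rate: float = 0.15,
                     high_rate: float = 0.40) -> tuple[float, float]:
    """Return the (low, high) cost in dollars of human literary translation."""
    return word_count * low_rate, word_count * high_rate

def subscription_cost(months: int, monthly_fee: float = 20.0) -> float:
    """Return the total subscription cost in dollars over the given months."""
    return months * monthly_fee

WORDS = 100_000  # hypothetical volume of source material
MONTHS = 6       # hypothetical length of the project

low, high = human_cost_range(WORDS)
print(f"Human translation: ${low:,.0f} to ${high:,.0f}")
print(f"ChatGPT subscription: ${subscription_cost(MONTHS):,.0f}")
```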

So, in my context, it made more sense to test the technology carefully and use it as responsibly as I could. The translations were not perfect. But they were immensely useful, and I knew enough about generative AI to respect its limits.

AI translations of propaganda

To the Lily of Chastity by Carlo Marini was the first religious biography published about Maria. It was financed by the populist Catholic paper La Vera Roma (The True Rome) and appeared in 1904, roughly two years after she died.

In this account, Maria is cast as blonde and buxom, two words that initially made me question the validity of the translation, given she was an Italian child living in poverty in the malarial Pontine Marshes. I checked and double-checked, and the translation seemed correct, leading me, for the first time, to recognize the unique challenge of translating propaganda. I needed to understand not only the literal and literary translation, but also the motive behind the word choice.

A few examples.

Take the line: “Maria Goretti was a beautiful woman.” Woman. She was eleven. Was the Italian original donna, a grown woman? Or was it a word a careful translator might have rendered as “girl,” or “child,” given who she actually was? Either the model smoothed a child into a woman, or the source did it on purpose, to age her up from murdered eleven-year-old into willing virgin-martyr.

Or take this: Maria, it says, “allowed herself to be pierced by fourteen dagger blows rather than allow her virginal purity to be stained” (in the original Italian, si lasciò trafiggere da quattordici colpi di pugnale piuttosto che lasciar contaminare la sua verginale purezza). The booklet also calls her a “heroic victim of her own chastity” (eroica vittima della propria castità), as if defense of chastity, and not a man, were the thing that killed her.

For a single word like vergogna the basic translation is “shame.” But the shame I carried as a kid wearing my Saint Maria Goretti necklace was a private feeling. In 1902 rural Italy, vergogna was not private at all. It was closer to a family property, a communal stain that fell on everyone connected to you, the thing that made a violated girl unmarriageable and unspeakable. Translate it flatly as “shame” and you lose the entire social machine behind the word.

AI translations and restricted content

Most AI interfaces have guardrails. These are automated safety systems, usually classifiers trained to recognize certain categories of content, that sit between you and the model. They scan what you send in and what the model sends back, and they block anything that looks like it lands in forbidden territory. These guardrails are defined by the company that owns the model, and in my case, this was OpenAI. Their usage policies state:

“No one can use our services for harassment, sexual violence, nonconsensual intimate content, violence, or hate-based violence.”

And:

“Our services … must never be used to … sexualize anyone under 18 years old.”

In my material, the central event is the sexual assault and murder of an eleven-year-old. To a classifier scanning for “minor” plus “sexual” plus “violence,” that looks exactly like forbidden territory. So, occasionally, mid-translation, my custom GPT would respond with a big red error. I’d give this a “thumbs down” to alert the OpenAI team to the false positive and try again, this time quickly screenshotting the response before it disappeared under the error.

False positives occur when the safety system flags something as harmful when it is not actually doing harm. I was not producing abuse. I was trying to document and understand a real historical crime. I acknowledge the importance of these policies, and I hope these systems keep getting better at telling apart the person creating harm from the person trying to study it.

And yeah, there seems to be some irony here that old Catholic devotional books so concerned with purity, chastity, and taking down the “obscene secular world” would be the very thing flagged as inappropriate by 2026 AI companies. Sounds like a story for my next novel…

Writing in the open

Thank you! This article is all part of my attempts to experiment with AI in creative writing. To be curious, transparent, and thoughtful with this new technology.