AI in the Liminal: Between Code and Culture
I’ve worked across the boundaries of language, culture, and machine learning—first as a scholar in critical theory and later as a professional in internationalization and NLP. Somewhere along the way, I found myself explaining “pragmatics” and “semiotics” in data labeling interviews—and getting blank stares. That moment stuck with me.
Because it speaks to something deeper: our current approaches to training large language models often treat language as fixed, static, and universal. But language isn’t math—it’s lived, situated, and always in motion.
What if we trained LLMs to reflect that? What if dialogism—yes, Bakhtin—wasn’t just a theory from literary studies, but a blueprint for hybrid, more human approaches to AI?
The Problem: AI’s Monologic Assumption
When I browse hashtags about data annotation and generative AI, I often see posts from professionals promising to “make your AI more accurate” or to reveal the secret method that will finally get your NLP model working “just right.” These claims tend to assume that language—the very heart of generative AI—is monologic: that there’s a single, correct meaning, one truth, and one authority that allows for no disagreement.
But nothing could be further from the truth. Language is rich, layered, and fundamentally dialogic. Meaning arises in context, in relationship, in response. It’s more like a tango—a sensory-rich dance of call and response with the music, the crowd, and your partner. Remove any one element, and the whole experience flattens. So it is with language.
Let’s say your data annotation pipeline accounts for tone and context, but annotators disagree. What do you do with that disagreement? If you’re discarding it, you’re missing something vital. That’s not noise—it’s signal. It’s gold. And if you’re letting it slip through your fingers, you’re no better than an inexperienced gold-rush miner, blind to the glint in the pan.
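To make that concrete, here is a minimal sketch in Python of what keeping disagreement might look like. The `soft_labels` helper is hypothetical; any pipeline that records individual annotator votes could do something like this. The idea is simply to preserve the split as a distribution instead of majority-voting it away.

```python
from collections import Counter

def soft_labels(annotations):
    """Turn raw annotator votes into a probability distribution
    rather than collapsing them to a single majority label."""
    counts = Counter(annotations)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Hypothetical example: three annotators split on the tone of an utterance.
votes = ["ironic", "ironic", "sincere"]
print(soft_labels(votes))  # {'ironic': 0.667, 'sincere': 0.333} (approx.)
```

A loss function that accepts soft targets (cross-entropy against a distribution, say) can then learn from the disagreement itself, rather than from a flattened consensus.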
Dialogism as a Better Framework
I’ll admit, referencing Bakhtin in a piece about LLMs might feel a little retro—maybe even cheugy by today’s standards. But I keep returning to his work because it reminds me of something we risk forgetting: language is not a product, it’s a relationship. It’s not static data to be cleaned and frozen; it’s a living, breathing act of exchange.
Bakhtin’s concept of heteroglossia—the coexistence of multiple, often conflicting voices within a single language—challenges the idea that there is one correct, “clean” version of any utterance. Similarly, polyphony, the presence of independent voices that maintain their integrity without being collapsed into one dominant perspective, invites us to imagine language models that don’t flatten difference but listen for it. Every utterance, in Bakhtin’s view, is not a closed statement but a response—and an anticipation of future response. It carries addressivity—a built-in awareness that someone is listening, and that this context shapes meaning.
What if our models could internalize this? Not just learn from static corpora, but engage in forms of mutual attunement? Could we build systems that respond dialogically—learning not by fixing data but by relating to it, by noticing difference, contradiction, and ambiguity as essential signals rather than noise?
So much of current AI development is fixated on scale. Bigger models, more data, more compute. But what if the problem isn’t size—it’s structure?
Instead of pouring resources into training ever-larger LLMs on frozen, contextless text, what if we focused on deepening our training practices? What if we treated language not as a static archive to be mined, but as a living network of situated meanings to be negotiated?
Bakhtin reminds us that meaning is always relational. That’s why I believe we should be exploring hybrid, iterative approaches—models trained not in isolation, but in conversation. Imagine a system of smaller LLMs that dialogue with one another, each shaped by different communities, languages, and geopolitical contexts. Their training data isn’t assumed to be neutral or singularly “correct”—it’s debated, annotated by people with diverging perspectives, and re-evaluated as context shifts.
This isn’t just an aesthetic or philosophical shift. It’s a technical proposal with ethical implications. A smaller model trained through iterative, dialogic cycles across diverse contexts might not only be more adaptable and transparent—it might be better at handling nuance, disagreement, and ambiguity than a giant model trained on a monolithic view of “truth.”
In other words, the path forward might not be upward, but outward—toward multiplicity, conversation, and cultural specificity.
Why One Size Doesn’t Fit All: A Legal Translation Parable
Having worked with legal translation across English and Spanish, I’ve often encountered a frustrating dissonance: Spanish legal texts still echo the rhetorical forms of 16th-century Castilian bureaucracy, while U.S. legal English favors stripped-down clarity and plain speech. When translation is handled by someone fluent in the language but not the legal register, the result is jarring. It reads like a text in the right language but the wrong voice—often legally inaccurate or culturally tone-deaf. This isn’t just about “accuracy”—it’s about discursive authority and situated expertise.
I suspect similar challenges arise in French and Italian, which share Latin legal roots, and I wonder how this plays out in CJK languages or across the Indo-European linguistic spectrum in India, where law is filtered through colonial inheritances and multilingual code-switching. How can a single LLM hope to “learn” legal language in such radically divergent cultural chronotopes? The answer might not be a bigger, smarter model—but rather a network of specialized, culturally informed agents that learn from one another through structured dialogue, disagreement, and contextual annotation.
Toward a Dialogic System: A Sketch for Hybrid LLM Training
What if instead of a single monolithic model trained on “the whole internet,” we pivoted toward a networked constellation of smaller, culturally tuned models—each trained with iterative, context-aware methods and designed to learn not just from static data, but from each other?
Here’s a sketch of such a system:
1. Modular, Localized Models (Nodes)
- Each model is trained on data curated, annotated, and validated by regional experts, including translators, cultural analysts, and legal specialists.
- Models are optimized for language-register combinations, e.g., Latin American vs. Peninsular Spanish legalese, or U.S. vs. Indian legal English.
2. Dialogic Learning Layer
- Rather than enforcing a single “ground truth,” annotations allow for disagreement and variance. These tensions are not discarded—they become training features.
- Smaller models exchange interpretations of similar prompts or documents, surfacing where they agree or diverge. This enables a form of computational polyphony.
3. Hybrid Training Infrastructure
- Combines supervised learning (human-annotated data) with retrieval-augmented generation (RAG) and multi-agent debate systems.
- Each model doesn’t just “learn” on its own but can query sibling models for additional perspectives, like nodes in a conversational network (see the code sketch after this list).
4. Transparent Feedback Loops
- Human annotators (across regions and disciplines) review points of conflict or uncertainty.
- Rather than collapsing these into a single canonical answer, the model learns to retain contextual ambiguity where needed—e.g., tone, cultural reference, legal formality.
5. Evaluation as Dialogue
- Instead of benchmarking solely on accuracy, we assess discursive awareness: Can the model shift tone across cultures? Can it identify when a phrase carries different legal or emotional weight in Tamil, English, or Portuguese?
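To ground the sketch, here is a minimal, hypothetical illustration in Python of steps 1 through 4. Nothing here is a real API: `Node`, `dialogic_pass`, and `route_for_review` are stand-in names I’m inventing for illustration, and the lambdas stand in for calls to actual fine-tuned models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Node:
    """One smaller, culturally tuned model in the network (step 1).
    `interpret` stands in for a call to a real fine-tuned model."""
    name: str                        # e.g., "es-MX-legal"
    language: str                    # ISO language code
    register: str                    # e.g., "legal", "colloquial"
    interpret: Callable[[str], str]  # prompt -> interpretation

def dialogic_pass(nodes: List[Node], prompt: str) -> Dict:
    """One round of computational polyphony (steps 2-3): every node reads
    the prompt, and divergence is surfaced rather than averaged away."""
    readings = {node.name: node.interpret(prompt) for node in nodes}
    divergent = len(set(readings.values())) > 1
    return {"prompt": prompt, "readings": readings, "divergent": divergent}

def route_for_review(result: Dict) -> None:
    """Transparent feedback loop (step 4): conflicts go to regional human
    annotators instead of being collapsed into one canonical answer."""
    if result["divergent"]:
        print("Flag for expert review:", result["readings"])

# Toy stand-ins; real nodes would wrap fine-tuned models, possibly with
# retrieval-augmented generation over region-specific corpora.
nodes = [
    Node("es-MX-legal", "es", "legal", lambda p: "reading A"),
    Node("es-ES-legal", "es", "legal", lambda p: "reading B"),
    Node("en-US-legal", "en", "legal", lambda p: "reading A"),
]

route_for_review(dialogic_pass(nodes, "a clause on liability"))
```

An evaluation harness in this spirit (step 5) would then score whether each node’s reading shifts register and tone appropriately for its context, rather than measuring distance from a single reference answer.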
I’ve worked at the boundaries of language, culture, and systems—from translating centuries-old legal texts to imagining how AI might one day listen more closely. The future of generative AI isn’t just about scale—it’s about reciprocity, context, and better conversations.
I believe AI should be equitable, multilingual, and in dialogue with the world it speaks into. It should carry sabor, be infused with duende and orality, and leave space for those at the margins to declare: I am here.
If you’re building toward that kind of future, let’s talk.