The Engineering Chronicle
CHUNKING STRATEGIES
The Art and Science of Document Segmentation for AI Systems
"How documents are split determines how well AI systems understand them. Chunking is the invisible foundation upon which all retrieval quality rests—get it wrong, and no amount of sophisticated modeling can compensate."
WHY CHUNKING MATTERS
Embedding models have context limits. A document of 50,000 tokens cannot be embedded as a single unit. It must be divided—but how? This seemingly simple question determines the success or failure of retrieval-augmented systems.
Poor chunking creates orphaned context. A chunk that says "as mentioned above" without including what was mentioned becomes meaningless. A table split mid-row loses its data relationships. A legal clause severed from its definitions becomes uninterpretable.
The embedding vector represents the semantic meaning of the chunk. If the chunk itself lacks coherent meaning, the vector becomes noise—degrading retrieval precision and polluting LLM context windows with irrelevant or misleading content.
CHUNKING APPROACHES
Fixed-size chunking splits text every N tokens regardless of content. Simple to implement, but semantically naive. It serves as a baseline but rarely as a production solution.
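As a minimal sketch of the fixed-size approach, the function below splits every N "tokens" — using whitespace-separated words as a stand-in for real tokens, where a production pipeline would use a tokenizer such as tiktoken:

```python
def fixed_size_chunks(text, chunk_size=512):
    """Split text every `chunk_size` tokens, ignoring content boundaries.

    Words stand in for tokens here; a real pipeline would count
    tokens with a tokenizer such as tiktoken.
    """
    tokens = text.split()
    return [" ".join(tokens[i:i + chunk_size])
            for i in range(0, len(tokens), chunk_size)]
```

The simplicity is the point — and the problem: a sentence, table row, or clause that straddles a chunk boundary is cut without warning.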
Recursive character splitting improves on this by respecting paragraph and sentence boundaries. LangChain's implementation handles most general content adequately, but struggles with structured documents.
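The core idea can be sketched in a few lines. This is a simplified, character-length version of what LangChain's RecursiveCharacterTextSplitter does, not its actual implementation: try the coarsest separator first (paragraph breaks), and fall back to finer ones only when a piece is still too long.

```python
def recursive_split(text, max_len=512, separators=("\n\n", "\n", ". ", " ")):
    """Recursively split on the coarsest separator that yields
    pieces under max_len (measured in characters for simplicity)."""
    if len(text) <= max_len:
        return [text]
    for sep in separators:
        if sep in text:
            parts = text.split(sep)
            chunks, buf = [], ""
            for part in parts:
                candidate = buf + sep + part if buf else part
                if len(candidate) <= max_len:
                    buf = candidate
                    continue
                if buf:
                    chunks.append(buf)
                if len(part) > max_len:
                    # Still too long: recurse with the finer separators.
                    chunks.extend(recursive_split(part, max_len, separators))
                    buf = ""
                else:
                    buf = part
            if buf:
                chunks.append(buf)
            return chunks
    # No separator found at all: hard split as a last resort.
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]
```

Because paragraph breaks are tried before sentence breaks, and sentence breaks before spaces, chunks tend to end at the most natural available boundary.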
Semantic chunking uses embedding similarity to detect topic boundaries. When consecutive sentences diverge semantically beyond a threshold, a new chunk begins. This preserves topical coherence but requires embedding computation during ingestion.
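A minimal sketch of the boundary-detection step: compare adjacent sentence embeddings and cut wherever similarity drops below the threshold. The bag-of-words embedder below is a toy stand-in for a real model such as sentence-transformers, and the sketch compares single adjacent sentences rather than the multi-sentence windows a production splitter would use.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(sentences, embed, threshold=0.8):
    """Start a new chunk wherever consecutive sentences fall
    below `threshold` cosine similarity."""
    chunks, current = [], [sentences[0]]
    prev = embed(sentences[0])
    for sent in sentences[1:]:
        vec = embed(sent)
        if cosine(prev, vec) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
        prev = vec
    chunks.append(" ".join(current))
    return chunks

# Toy bag-of-words embedder over a fixed vocabulary -- a stand-in
# for a real embedding model.
VOCAB = ["revenue", "profit", "gpu", "cuda"]
def toy_embed(sentence):
    words = sentence.lower().split()
    return [words.count(w) for w in VOCAB]
```

The threshold does real work here: too high and every sentence becomes its own chunk; too low and topic shifts slip through undetected.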
Document-aware chunking understands structure. Headers define sections. Lists stay unified. Tables preserve row-column relationships. Code blocks maintain syntactic integrity. This requires format-specific parsers but delivers superior results.
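For markdown, the structure-aware idea reduces to a header-aware splitter: each section becomes one chunk, and the header path travels with it as metadata. A minimal sketch:

```python
import re

def markdown_chunks(md_text):
    """Split markdown at headers, keeping each section's body as one
    chunk and recording the header path as metadata."""
    header_re = re.compile(r"^(#{1,6})\s+(.*)")
    chunks, path, current = [], [], []
    for line in md_text.splitlines():
        m = header_re.match(line)
        if m:
            if current:
                chunks.append({"headers": list(path),
                               "text": "\n".join(current).strip()})
                current = []
            level = len(m.group(1))
            # Truncate the path to the parent level, then descend.
            path = path[:level - 1] + [m.group(2)]
        else:
            current.append(line)
    if current:
        chunks.append({"headers": list(path),
                       "text": "\n".join(current).strip()})
    return chunks
```

The header path in metadata is what lets a retrieved chunk answer "as mentioned above" questions: the context that would otherwise be severed travels with the chunk.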
OVERLAP STRATEGIES
Chunk overlap creates redundancy—intentionally. When consecutive chunks share boundary content, context that might otherwise be severed is preserved in at least one chunk.
The overlap percentage balances retrieval quality against storage costs. Too little overlap risks losing critical context. Too much creates redundant embeddings that inflate vector stores and can cause duplicate retrieval results.
Empirical testing across document types reveals optimal ranges. Technical documentation benefits from 15-20% overlap. Narrative content works well at 10%. Legal text often requires 25% or higher due to dense cross-referencing.
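Mechanically, a fixed overlap percentage is just a sliding window whose step is smaller than its width. A minimal sketch over a pre-tokenized sequence:

```python
def overlapping_chunks(tokens, chunk_size=512, overlap_pct=0.15):
    """Slide a window of `chunk_size` tokens, stepping by
    chunk_size * (1 - overlap_pct) so consecutive chunks share
    their boundary content."""
    step = max(1, int(chunk_size * (1 - overlap_pct)))
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With chunk_size=512 and overlap_pct=0.15, each chunk repeats roughly 77 tokens of its predecessor — the storage cost the percentages above are trading against retrieval quality.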
Adaptive overlap adjusts based on content density. Sections with many internal references get higher overlap automatically. The system learns document patterns during ingestion.
IMPLEMENTATION SPECIFICATIONS
Recursive Splitter Config
- Chunk Size: 512-1024 tokens (domain-dependent)
- Overlap: 10-25% of chunk size
- Separators: ["\n\n", "\n", ". ", " "]
- Length Function: tiktoken for accuracy
Semantic Splitter Config
- Embedding Model: sentence-transformers
- Similarity Threshold: 0.75-0.85
- Window Size: 3 sentences
- Min Chunk: 100 tokens
Document-Aware Parsing
- PDF: PyMuPDF with layout analysis
- HTML: BeautifulSoup + structure extraction
- Markdown: Custom header-aware splitter
- Code: AST-based function boundaries
Quality Metrics
- Coherence Score: GPT-4 evaluation
- Retrieval Precision: Ground truth testing
- Chunk Size Distribution: Variance analysis
- Overlap Effectiveness: Context coverage
STRATEGY COMPARISON
| Strategy | Best For | Limitations | Compute Cost |
|---|---|---|---|
| Fixed-Size | Homogeneous text, baseline testing | Ignores semantic boundaries | Minimal |
| Recursive | General documents, articles | Misses topic shifts within paragraphs | Low |
| Semantic | Topic-diverse content, research papers | Embedding cost, variable chunk sizes | Medium |
| Document-Aware | Structured docs, legal, technical | Format-specific implementation | Medium-High |
PATENT DOCUMENTS
Patent claims have strict structural requirements. Independent claims must stay whole. Dependent claims need references to their parents preserved. The custom chunker identifies claim boundaries using regex patterns and maintains hierarchical relationships in metadata.
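The claim-boundary logic can be sketched as follows. The regex patterns here are illustrative simplifications — claims typically read "1. A device comprising..." or "2. The device of claim 1, wherein..." — and the real chunker would need to handle multi-parent dependencies and formatting variants:

```python
import re

# Illustrative patterns: claim numbering at line start, and
# back-references of the form "claim N".
CLAIM_RE = re.compile(r"^(\d+)\.\s", re.MULTILINE)
DEPENDS_RE = re.compile(r"claim\s+(\d+)", re.IGNORECASE)

def chunk_claims(claims_text):
    """One chunk per claim, with the parent claim (if any) in metadata."""
    matches = list(CLAIM_RE.finditer(claims_text))
    chunks = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(claims_text)
        body = claims_text[m.start():end].strip()
        dep = DEPENDS_RE.search(body)
        chunks.append({
            "claim": int(m.group(1)),
            "parent": int(dep.group(1)) if dep else None,  # None = independent claim
            "text": body,
        })
    return chunks
```

Keeping the parent reference in metadata means a retrieved dependent claim can always be expanded with the independent claim it narrows.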
Result: 94% retrieval precision on claim-specific queries, compared to 67% with recursive splitting. The difference directly impacts patent search quality and office action response accuracy.
FINANCIAL REPORTS
Financial documents mix narrative text, tables, and figures. Standard chunking destroys table structure. The document-aware approach extracts tables as markdown, preserving row-column relationships while enabling semantic search over numerical data.
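The rendering step — once a parser such as PyMuPDF has extracted the cells — is straightforward; the sketch below assumes the table already arrives as a header row plus data rows:

```python
def table_to_markdown(header, rows):
    """Render an extracted table as a markdown chunk so row-column
    relationships survive chunking and remain searchable."""
    lines = ["| " + " | ".join(header) + " |",
             "|" + "---|" * len(header)]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)
```

Because each row keeps its column headers in the same chunk, an embedding of the chunk carries both the label ("Revenue") and the value — which is what lets numerical queries retrieve the right rows.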
Result: Queries like "Q3 revenue growth" correctly retrieve the relevant table rows, with values intact and properly attributed. This was impossible with naive chunking approaches.
"Chunking is where information retrieval meets information preservation. The best chunk is one that could stand alone as a coherent unit of knowledge."
— RAG Systems Design Principles