BinAssist · RAG

Knowledge augmentation and document-assisted workflows.

RAG Tab Reference

The RAG (Retrieval-Augmented Generation) tab manages external documentation that can be used to enhance LLM responses with relevant context.

Purpose

RAG allows you to import documentation relevant to your analysis, such as:

  • API documentation for libraries used by the binary
  • Protocol specifications
  • Vendor documentation
  • Research papers
  • Previous analysis notes

When RAG is enabled in the Explain or Query tab, BinAssist adds relevant portions of your indexed documents to the LLM prompt, so the model can ground its response in that material instead of relying on its training data alone.

How RAG Works

  1. You import documents into BinAssist
  2. Documents are split into chunks and indexed
  3. When you ask a question with RAG enabled, BinAssist searches for relevant chunks
  4. Relevant chunks are included in the LLM prompt
  5. The LLM uses this context to provide better answers
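The steps above can be sketched in a few lines of Python. This is a minimal illustration of the general retrieve-then-prompt pattern, not BinAssist's implementation: the function names, the blank-line chunking rule, the term-count relevance score, and the prompt wording are all assumptions made for the example.

```python
import re
from collections import Counter


def split_into_chunks(document: str) -> list[str]:
    """Step 2: split on blank lines (an assumed, deliberately naive rule)."""
    return [part.strip() for part in document.split("\n\n") if part.strip()]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and extract alphanumeric terms."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def relevance(query: str, chunk: str) -> int:
    """Count occurrences of the query's distinct terms in the chunk."""
    counts = Counter(tokenize(chunk))
    return sum(counts[term] for term in set(tokenize(query)))


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Step 3: return up to top_k chunks with a non-zero score, best first."""
    scored = [(relevance(query, chunk), chunk) for chunk in chunks]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Step 4: place the retrieved chunks ahead of the question."""
    context = "\n---\n".join(context_chunks)
    return (
        "Use the following documentation excerpts as context.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Calling `retrieve()` on the chunks of an imported document and passing the result to `build_prompt()` yields a prompt that contains only the excerpts sharing terms with the question; unrelated chunks never reach the LLM.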

UI Elements

Document Table

Lists all imported documents:

| Column | Description |
|--------|-------------|
| Name | Document filename |
| Size | File size |
| Chunks | Number of indexed chunks |

Select a document to enable the Refresh and Delete buttons.

Management Buttons

| Button | Description |
|--------|-------------|
| Add Document | Import a new document |
| Refresh | Re-index the selected document |
| Delete | Remove the selected document from the index |

Search Section

Test the RAG index by searching for content:

  • Search Input: Enter keywords or phrases
  • Search Mode: Select the search algorithm
  • Results: Matching chunks, each shown with its relevance score

Statistics Panel

Shows index statistics:

| Statistic | Description |
|-----------|-------------|
| Documents | Number of indexed documents |
| Chunks | Total chunks across all documents |
| Embeddings | Vector embeddings count |

Index Management

  • Clear Index: Delete all documents and reset the RAG index

Supported Document Types

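BinAssist recognizes the following document types. PDF is listed for future support:

| Type | Extension | Notes |
|------|-----------|-------|
| Text | .txt | Plain text files |
| Markdown | .md | Markdown formatted documents |
| PDF | .pdf | PDF documents (future support) |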
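Before importing a folder of reference material, it can help to list which files fall under the currently indexable types. The helper below is a hypothetical convenience script that runs outside BinAssist; it assumes that only `.txt` and `.md` are importable today, since PDF is marked as future in the table above.

```python
from pathlib import Path

# Assumption: only the extensions not marked "future" can be indexed today.
INDEXABLE_EXTENSIONS = {".txt", ".md"}


def find_importable(folder: str) -> list[Path]:
    """Return the files directly inside folder that have an indexable extension."""
    candidates = [
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in INDEXABLE_EXTENSIONS
    ]
    # Sort by name so the listing is stable across platforms.
    return sorted(candidates, key=lambda path: path.name.lower())
```

Each returned path can then be imported one at a time with Add Document.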

Search Modes

When searching or retrieving context, BinAssist supports three search modes:

| Mode | Description | Best For |
|------|-------------|----------|
| Hybrid | Combines text and vector search | General use (recommended) |
| Text | Keyword-based full-text search | Exact term matching |
| Vector | Semantic similarity search | Conceptual queries |
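To make the difference between the three modes concrete, the sketch below scores chunks in each mode. It is an illustration under stated assumptions rather than BinAssist's actual scoring: the text score is simply the fraction of query terms found in the chunk, the embeddings are hand-written toy vectors standing in for a real embedding model, and the hybrid score is an assumed weighted average controlled by a hypothetical `alpha` parameter.

```python
import math
import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its distinct alphanumeric terms."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def text_score(query: str, chunk: str) -> float:
    """Text mode: fraction of distinct query terms that appear in the chunk."""
    query_terms = tokens(query)
    if not query_terms:
        return 0.0
    return len(query_terms & tokens(chunk)) / len(query_terms)


def cosine(a: list[float], b: list[float]) -> float:
    """Vector mode: cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_score(text_s: float, vector_s: float, alpha: float = 0.5) -> float:
    """Hybrid mode: assumed weighted average of the two scores."""
    return alpha * text_s + (1 - alpha) * vector_s


def rank(query, query_vec, indexed, mode="hybrid"):
    """Score (chunk, embedding) pairs in the given mode, best first."""
    scored = []
    for chunk, vec in indexed:
        t, v = text_score(query, chunk), cosine(query_vec, vec)
        if mode == "text":
            score = t
        elif mode == "vector":
            score = v
        elif mode == "hybrid":
            score = hybrid_score(t, v)
        else:
            raise ValueError(f"unknown mode: {mode}")
        scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

In this toy setup, a chunk that contains the literal query terms wins in text mode, a chunk whose embedding points in the same direction as the query wins in vector mode, and hybrid mode gives credit for both, which is why it is a reasonable default when you do not know in advance whether the exact wording appears in your documents.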

Adding Documents

  1. Click Add Document
  2. Select a file from your system
  3. Wait for indexing to complete
  4. The document appears in the table

Documents are automatically chunked for efficient retrieval. Large documents may take a moment to process.
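As a rough picture of what chunking involves, the sketch below splits a document into fixed-size word windows that overlap slightly, so a sentence straddling a boundary still appears whole in at least one chunk. The window size, the overlap, and the word-based splitting are assumptions chosen for illustration; BinAssist's actual chunking settings are not documented here.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of up to chunk_size words, sharing overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the document.
        if start + chunk_size >= len(words):
            break
    return chunks
```

With this scheme the chunk count grows roughly linearly with document length, which is consistent with larger documents taking longer to index.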

Using RAG in Analysis

To use RAG context in your analysis:

  1. Import relevant documents in the RAG tab
  2. Switch to the Explain or Query tab
  3. Enable the RAG checkbox
  4. Perform your analysis or ask your question
  5. The LLM receives relevant document context automatically

Best Practices

Document Selection

Import documents that are directly relevant to your analysis:

  • API documentation for libraries the binary uses
  • Protocol specs for network protocols in use
  • Vendor manuals for the software being analyzed
  • Research papers on relevant techniques or malware families

Document Size

  • Smaller, focused documents work better than large general references
  • Consider splitting very large documents into focused sections
  • Remove irrelevant sections before importing
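One way to split a large Markdown reference into focused sections is to cut it at a chosen heading level before importing. The helper below is a hypothetical preprocessing sketch that runs outside BinAssist; it assumes ATX-style headings (`## Title`) and does not account for heading-like lines inside fenced code blocks.

```python
import re


def split_markdown_by_heading(text: str, level: int = 2) -> list[tuple[str, str]]:
    """Return (title, section_text) pairs, one per heading at the given level.

    Text before the first matching heading is not included in any section.
    Deeper headings stay inside the section of their parent heading.
    """
    pattern = re.compile(rf"^{'#' * level} +(.+?)[ \t]*$", re.MULTILINE)
    matches = list(pattern.finditer(text))
    sections = []
    for index, match in enumerate(matches):
        # A section runs from its heading to the next heading of the same level.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        sections.append((match.group(1), text[match.start():end].strip()))
    return sections
```

Each section can then be saved as its own `.md` file and imported separately, keeping only the sections that matter for the binary under analysis.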

Refreshing Documents

Use Refresh when:

  • The source document has been updated
  • You want to re-chunk with different settings
  • The index seems to be returning poor results

Index Storage

The RAG index is stored in the BinAssist data directory (see Settings Tab for the path). It persists across sessions, so you don't need to re-import documents each time.