About

From raw text to linguistic insights, in a few clicks.

SA7BY® is an NLP lab for tabular corpora. Upload an Excel or CSV file, choose the column that holds your text, and tokenization, lemmatization, POS tagging, and dependency parsing run in the background. Then explore semantic analysis with WOLF and FastText.

Your corpus, structured and explorable

Import a CSV, Excel, XML, or PDF file. Within minutes, every word is annotated and every sentence is shown in context. Explore co-occurrences, n-grams, word proximity, and semantic relations. Export your results to CSV at any time.

Built for researchers in linguistics, textometry, and digital humanities.

Key features

Smart import

CSV, Excel, or PDF. Preview columns or pages, choose your source, and start processing.

Full lexicon

All words in your corpus sorted by frequency or alphabetically. Multi-POS filter, interactive word cloud, and CSV export.

Concordances

Every occurrence of a word with its left and right context. KWIC sort (L1, L2, R1, R2), configurable context size, and reading mode.
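
As an illustration of what a KWIC view computes, here is a minimal Python sketch. It assumes the text is already tokenized into a flat list; the function names `kwic` and `sort_by_r1` are hypothetical and do not describe how SA7BY® is implemented.

```python
def kwic(tokens, keyword, context=2):
    """Return one (left, node, right) triple per occurrence of keyword."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = tokens[max(0, i - context):i]       # up to `context` words before
            right = tokens[i + 1:i + 1 + context]      # up to `context` words after
            rows.append((left, tok, right))
    return rows


def sort_by_r1(rows):
    """Sort concordance lines by R1, the first word to the right of the node."""
    return sorted(rows, key=lambda row: row[2][0].lower() if row[2] else "")


tokens = "the cat sat on the mat and the dog sat on the rug".split()
rows = sort_by_r1(kwic(tokens, "the", context=2))
for left, node, right in rows:
    print(" ".join(left).rjust(12), f"[{node}]", " ".join(right))
```

Sorting on L1, L2, or R2 follows the same idea with a different position in the left or right context as the sort key.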

Metadata filters

Filter your corpus by author, date, genre, or any column from the original file. Multi-select, cascading filters, persisted across pages.

Co-occurrences

Discover words that frequently appear together. Five statistical measures and a configurable span let you analyze lexical associations.
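
To make the idea of a span and an association measure concrete, here is a small Python sketch. It uses pointwise mutual information (PMI), a common association measure chosen here only as an example; the five measures SA7BY® offers are not listed on this page. The function names and the simplified expected-frequency formula are assumptions.

```python
import math
from collections import Counter


def cooccurrences(tokens, node, span=2):
    """Count the words found within `span` tokens on either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


def pmi(tokens, node, collocate, span=2):
    """PMI of node and collocate, with a simplified window-based expectation."""
    freq = Counter(tokens)
    observed = cooccurrences(tokens, node, span)[collocate]
    if observed == 0:
        return float("-inf")
    # Expected co-occurrences if the two words were independent:
    # f(node) * f(collocate) * window size / corpus size.
    expected = freq[node] * freq[collocate] * (2 * span) / len(tokens)
    return math.log2(observed / expected)


tokens = "strong tea and strong coffee but weak tea".split()
neighbours = cooccurrences(tokens, "strong", span=1)
score = pmi(tokens, "strong", "tea", span=1)
```

A wider span catches looser associations at the cost of more noise, which is why the span is usually left configurable.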

N-grams

Identify recurring expressions and word sequences, from 2 to 5 words. Toggle between lemma and surface form.
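
A minimal sketch of n-gram counting in Python, assuming the corpus has already been tokenized and lemmatized. The function name `count_ngrams` and the hand-written sample data are illustrative, not part of SA7BY®.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count every contiguous sequence of n tokens (n from 2 to 5)."""
    if not 2 <= n <= 5:
        raise ValueError("n must be between 2 and 5")
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hand-written surface forms and lemmas, for illustration only.
surface = ["she", "runs", "and", "he", "runs", "and", "they", "ran"]
lemmas = ["she", "run", "and", "he", "run", "and", "they", "run"]

surface_bigrams = count_ngrams(surface, 2)
lemma_bigrams = count_ngrams(lemmas, 2)
print(surface_bigrams.most_common(3))
print(lemma_bigrams.most_common(3))
```

Counting on lemmas groups inflected variants under one sequence, while counting on surface forms keeps them apart.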

Proximity search

Find sentences where two words appear near each other. Control distance and word order, and filter by grammatical category.
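
The distance and order controls can be pictured with a short Python sketch. It assumes each sentence is a list of tokens; the function name `near` and its parameters are hypothetical.

```python
def near(tokens, a, b, max_distance=3, ordered=False):
    """True if a and b occur within max_distance tokens of each other.

    With ordered=True, a must come before b.
    """
    positions_a = [i for i, tok in enumerate(tokens) if tok == a]
    positions_b = [i for i, tok in enumerate(tokens) if tok == b]
    for i in positions_a:
        for j in positions_b:
            gap = j - i
            if ordered and gap <= 0:
                continue  # b does not follow a
            if 0 < abs(gap) <= max_distance:
                return True
    return False


sentences = [
    "the minister signed the new law".split(),
    "the law was criticised long before any minister replied".split(),
]
hits = [s for s in sentences if near(s, "minister", "law", max_distance=4)]
```

Filtering by grammatical category would add a POS check on each token before its position is recorded.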

Named entities

Automatically detect persons, locations, and organizations in your corpus. Filter by type and view the sentences where each entity occurs.

Pattern search

Build word sequences with POS, lemmas, and gaps. Find all linguistic patterns in your corpus.

Manual annotation

Add your own columns to the concordance table to classify each occurrence. Free text or dropdown, included in CSV export.

Keywords

Discover words specific to your corpus compared to general French. Keyness score and effect size.
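
For readers unfamiliar with keyness, here is a Python sketch of two widely used statistics: log-likelihood (G2) as a keyness score and log ratio as an effect size. These are chosen as common examples; this page does not say which formulas SA7BY® uses, and the function names and sample figures are assumptions.

```python
import math


def log_likelihood(freq_corpus, size_corpus, freq_ref, size_ref):
    """G2 keyness score for one word, comparing a corpus with a reference."""
    total = freq_corpus + freq_ref
    expected_corpus = size_corpus * total / (size_corpus + size_ref)
    expected_ref = size_ref * total / (size_corpus + size_ref)
    g2 = 0.0
    # A zero frequency contributes nothing to the sum.
    if freq_corpus > 0:
        g2 += freq_corpus * math.log(freq_corpus / expected_corpus)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * g2


def log_ratio(freq_corpus, size_corpus, freq_ref, size_ref):
    """Effect size: binary log of the ratio of relative frequencies."""
    # Both frequencies must be non-zero; real tools usually smooth zeros.
    return math.log2((freq_corpus / size_corpus) / (freq_ref / size_ref))


# Illustrative figures: a word seen 40 times in a 1,000-word corpus
# and 100 times in a 10,000-word reference corpus.
score = log_likelihood(40, 1000, 100, 10000)
effect = log_ratio(40, 1000, 100, 10000)
```

The score indicates how much evidence there is for a difference, while the effect size indicates how large the difference is: a log ratio of 2 means the word is four times as frequent, relatively, in the corpus as in the reference.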

How it works

1. Upload

Drop your file, then choose the text column or the PDF pages. Linguistic processing starts automatically.

2. Explore

Browse the lexicon, concordances, n-grams, and co-occurrences. Filter by grammatical category and export to CSV.

3. Analyze

Search for nearby words, explore lexical associations, and run semantic analysis with an interactive graph.

Questions? Check the FAQ or the documentation.