Lingtrain Studio is an ML-based app for accurate alignment of texts in different languages.
- Extracts parallel corpora from two texts.
- Makes a formatted parallel book from it with sentence highlighting.
- How to create bilingual books. Part 2. Lingtrain Alignment Studio
- How to make a parallel texts for language learning. Part 1. Python and Colab version
- Lingtrain Aligner. An app for creating parallel books that will surprise you
- Be your own Gutenberg. Making parallel books
The automated alignment process relies on sentence embedding models. Embeddings are multidimensional vectors that are used to calculate the distance between sentences. You can also plug in your own model using the interface described in the models directory. The list of supported languages depends on the selected backend model.
- more reliable and fast
- moderate weights size — 500MB
- supports 50+ languages
- full list of supported languages can be found in this paper
- LaBSE (Language-agnostic BERT Sentence Embedding)
- can be used for rare languages
- pretty heavy weights — 1.8GB
- supports 100+ languages
- full list of supported languages can be found here
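To illustrate how embeddings can be used to measure the distance between sentences, here is a minimal sketch. It is not the alignment logic used by Lingtrain Studio: the vectors are made-up 3-dimensional toy values (real models produce vectors with hundreds of dimensions), and the helper names `cosine_similarity` and `best_matches` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_matches(source_vecs, target_vecs):
    """For each source vector, return the index of the most similar target vector."""
    matches = []
    for s in source_vecs:
        scores = [cosine_similarity(s, t) for t in target_vecs]
        matches.append(scores.index(max(scores)))
    return matches


# Toy "embeddings", invented for illustration only.
source = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.2]]
target = [[0.1, 0.9, 0.3], [0.9, 0.1, 0.0]]

print(best_matches(source, target))  # prints [1, 0]
```

In this toy case the first source sentence is closest to the second target sentence and vice versa. A real aligner would also have to handle sentence order, merged or split sentences, and ambiguous matches.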
💻 Running on local machine
You can run the application on your computer using docker.
Make sure that docker is installed by typing the docker version command in your console.
Images configured to run locally are available on Docker Hub.
Run the following commands in your console:
docker pull lingtrain/studio:v7.2
docker run -v C:\app\data:/app/data -v C:\app\img:/app/static/img -p 80:80 lingtrain/studio:v7.2
App will be available in your browser on the
If you need to run the container on another port (e.g. localhost:8081):
- Change the API_URL parameter in config.js
- Rebuild the docker container
- Start it with changed -p parameter (e.g. -p 8081:80)
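As a rough sketch of the last step only, the snippet below builds the `docker run` argument list with a custom host port. The function name `docker_run_args` is hypothetical, the volume paths are the ones from the command above, and the `image` default is an assumption that you would likely replace with the tag of your rebuilt image. It does not edit config.js or rebuild anything.

```python
def docker_run_args(host_port, data_dir, img_dir, image="lingtrain/studio:v7.2"):
    """Build the argument list for `docker run` with a custom host port.

    The container still listens on port 80; only the host side changes.
    """
    return [
        "docker", "run",
        "-v", f"{data_dir}:/app/data",
        "-v", f"{img_dir}:/app/static/img",
        "-p", f"{host_port}:80",
        image,
    ]


args = docker_run_args(8081, r"C:\app\data", r"C:\app\img")
print(" ".join(args))
```

The resulting list can be passed to `subprocess.run(args)` if you prefer to start the container from a script rather than typing the command by hand.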
🔨 Running in development mode
Clone this repo on your machine.
Flask/uwsgi backend REST API service. It’s pretty simple and contains all the alignment logic.
SPA. Vue + vuex + vuetify. UI for managing the alignment process through the backend, and a tool for translators to edit the documents being processed.
Compile and run with hot-reloads for development
npm run serve
You can create an issue or send me a message on Telegram: @averkij
This work is licensed under an Attribution-NonCommercial-NoDerivatives 4.0 International license. See LICENSE.