Lingtrain Studio
Intro
Lingtrain Studio is an ML-based app for accurate alignment of texts in different languages.
- Extracts a parallel corpus from two texts.
- Makes a formatted parallel book from it with sentence highlighting.
⚡ Articles
- How to create bilingual books. Part 2. Lingtrain Alignment Studio
- How to make a parallel texts for language learning. Part 1. Python and Colab version
- Lingtrain Aligner. An app for creating parallel books that will surprise you (in Russian)
- Be your own Gutenberg. Making parallel books (in Russian)
Models
The automated alignment process relies on sentence embedding models. Embeddings are multidimensional vectors that are used to calculate the distance between sentences: sentences with similar meaning get vectors that lie close together, even across languages. You can also plug in your own model using the interface described in the models directory. The list of supported languages depends on the selected backend model.
- distiluse-base-multilingual-cased-v2
  - more reliable and fast
  - moderate weights size: 500MB
  - supports 50+ languages
  - full list of supported languages can be found in this paper
- LaBSE (Language-agnostic BERT Sentence Embedding)
  - can be used for rare languages
  - pretty heavy weights: 1.8GB
  - supports 100+ languages
  - full list of supported languages can be found here
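To illustrate how embedding distance can drive alignment, here is a minimal sketch in plain Python. It is not the actual Lingtrain algorithm: the vectors are hand-made toy values standing in for real model output, and the helper names `cosine_similarity` and `best_matches` are hypothetical. It only shows the basic idea of pairing each source sentence with the closest target sentence.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_matches(src_vectors, tgt_vectors):
    """For each source vector, return the index of the most similar target vector."""
    matches = []
    for src in src_vectors:
        scores = [cosine_similarity(src, tgt) for tgt in tgt_vectors]
        matches.append(max(range(len(scores)), key=scores.__getitem__))
    return matches


# Toy 3-dimensional "embeddings" (assumption: real models produce
# vectors with hundreds of dimensions).
src_vectors = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.2]]
tgt_vectors = [[0.1, 0.9, 0.2], [0.9, 0.1, 0.0]]

print(best_matches(src_vectors, tgt_vectors))  # prints [1, 0]
```

A real aligner would likely also take sentence order into account and handle cases where one sentence maps to several, which a greedy nearest-neighbour lookup like this one ignores.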
Running on local machine
You can run the application on your computer using docker.
- Make sure that Docker is installed by running the docker version command in your console.
- Images configured to run locally are available on Docker Hub.
- Run the following commands in your console:
  docker pull lingtrain/studio:v7.2
  docker run -v C:\app\data:/app/data -v C:\app\img:/app/static/img -p 80:80 lingtrain/studio:v7.2
- The app will be available in your browser at the localhost address.
If you need to run the container on another port (e.g. localhost:8081):
- Change the API_URL parameter in config.js
- Rebuild the docker container
- Start it with the changed -p parameter (e.g. -p 8081:80)
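If you switch ports often, the run command can be assembled programmatically. The sketch below is only an illustration: `build_run_command` is a hypothetical helper, not part of the project, and its defaults simply mirror the paths and image tag from the commands above. Changing the port this way does not replace the API_URL change and rebuild described in the steps above.

```python
def build_run_command(host_port=80, data_dir=r"C:\app\data",
                      img_dir=r"C:\app\img", tag="v7.2"):
    """Build the docker run argument list for a given host port.

    Hypothetical helper; the defaults copy the values used in this README.
    """
    return [
        "docker", "run",
        "-v", f"{data_dir}:/app/data",       # alignment data volume
        "-v", f"{img_dir}:/app/static/img",  # generated images volume
        "-p", f"{host_port}:80",             # host port -> container port 80
        f"lingtrain/studio:{tag}",
    ]


command = build_run_command(host_port=8081)
print(" ".join(command))
# To actually start the container, the list could be passed to
# subprocess.run(command, check=True).
```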
Running in development mode
Clone this repo on your machine.
Backend
Flask/uwsgi backend REST API service. It’s pretty simple and contains all the alignment logic.
cd be
python main.py
Frontend
A single-page application built with Vue, Vuex, and Vuetify. It provides a UI for managing the alignment process through the backend and a tool for translators to edit the documents being processed.
cd fe
Setup
npm install
Compile and run with hot-reloads for development
npm run serve
✉️ Feedback
You can create an issue or send me a message on Telegram: @averkij
License
This work is licensed under an Attribution-NonCommercial-NoDerivatives 4.0 International license. See LICENSE.