Compare commits


No commits in common. "master" and "main" have entirely different histories.
master...main

54 changed files with 2270 additions and 309 deletions

131
README.md
View File

@@ -0,0 +1,131 @@
# greek_lang
A project for learning Greek: it translates words between languages (Greek, Russian, English) with OpenAI and stores them in a database (PostgreSQL) together with transcription, description, examples, category, and audio pronunciation (generated via Google TTS). It also includes a Telegram bot (aiogram) with state kept in Redis.
## Features
- Word translation between ru/en/el (ISO 639-1) via OpenAI.
- Automatic generation and storage of:
  - the lemma, transcription (IPA), translation, and description;
  - the part of speech, semantic category, an example, and its translation;
  - a short etymology;
  - a pronunciation audio file (aiogTTS).
- Results saved to PostgreSQL (SQLAlchemy + Alembic migrations).
- A Telegram bot built on aiogram with Redis-backed FSM storage.
- Configuration via .env and pydantic-settings; dependencies wired with dependency_injector.
## Requirements
- Python 3.13+
- PostgreSQL 14+
- Redis 6+
- An OpenAI API key
The uv dependency manager and the go-task utility (for Taskfile.yml) are recommended.
## Installation
1. Clone the repository:
git clone <repo-url>
cd greek_lang
2. Install the dependencies (with uv):
- Sync the dependencies: uv sync
- Update the lockfile: uv lock --upgrade && uv sync
3. Create a .env file in the project root (see the example below).
## Environment configuration (.env)
Defaults are shown as the pre-filled values below; variables listed without a value are required.
# OpenAI
API_KEY=sk-...
# Telegram Bot
TG_TOKEN=123456:ABC...
# Logging (optional: error reporting to Telegram)
LOG_TELEGRAM_BOT_TOKEN=123456:DEF... # optional
LOG_TELEGRAM_CHAT_ID=123456789 # optional
# PostgreSQL
DB_HOST=127.0.0.1
DB_PORT=5432
DB_NAME=greek_lang
DB_USER=greek_lang
DB_PASSWORD=greek_lang
DB_POOL_SIZE=20
DB_POOL_MAX_OVERFLOW=5
DB_CONNECT_WAIT_TIMEOUT_SECONDS=5
DB_DEBUG=false
# Redis
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
REDIS_DB=0
REDIS_USERNAME=
REDIS_PASSWORD=
REDIS_POOL_SIZE=100
Notes:
- The variables are read by the configuration classes in src/greek_lang/configs/*.py.
- For Alembic migrations the database URL can be overridden with the ALEMBIC_DB_URL environment variable. If it is not set, Alembic assembles the URL from .env.
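For illustration, a configuration class in this style picks up the prefixed variables. This is a sketch modeled on the RedisConfig added later in this diff; the real classes derive from a shared EnvConfig base in src/greek_lang/configs/__init__.py rather than from BaseSettings directly:
import pydantic
from pydantic_settings import BaseSettings, SettingsConfigDict

class RedisSettingsSketch(BaseSettings):
    # REDIS_HOST, REDIS_PORT, ... are read from the environment or from .env
    model_config = SettingsConfigDict(env_prefix="REDIS_", env_file=".env")
    host: str = "127.0.0.1"
    port: int = 6379
    db: int = 0
    password: pydantic.SecretStr | None = None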
## Database initialization and migrations
Alembic is already configured (see src/greek_lang/database/migrations/ and alembic.ini).
- Apply migrations up to the latest revision:
alembic -c src/greek_lang/database/alembic.ini upgrade head
- Roll back one step:
alembic -c src/greek_lang/database/alembic.ini downgrade -1
- Use a specific database URL (bypassing .env):
ALEMBIC_DB_URL="postgresql://user:pass@host:5432/dbname" \
alembic -c src/greek_lang/database/alembic.ini upgrade head
## Usage
### CLI translator
Example of running the CLI to translate a single word:
python -m cli.translate "έμπορος" -s el -t ru
- -s/--source: source language (ru|en|el), default el
- -t/--target: target language (ru|en|el), default ru
The run initializes the dependency containers, queries OpenAI, generates the pronunciation audio, and saves everything to the glossary_word table.
### Telegram bot
Make sure Redis and .env are configured, then start the bot:
python -m greek_lang.tg_bot
The bot deletes the webhook and starts polling. The /start command replies with a test message (a stub). The code already wires the Redis FSM storage and is ready for dialogs (aiogram-dialog).
## Architecture (overview)
- Dependency containers: src/greek_lang/container.py plus sub-containers for configs, the database, OpenAI, and Redis.
- Configuration: pydantic-settings; .env is read from the project root (src/greek_lang/configs/__init__.py).
- Translation: src/greek_lang/translator.py orchestrates OpenAI → TTS → insert/update in the database (UPSERT by term).
- OpenAI: src/greek_lang/openai_manager, with typed response parsing via client.beta.chat.completions.parse into the WordInfo model.
- Database: SQLAlchemy 2.0 with Alembic migrations; entities in glossaries/models.py and openai_manager/models.py.
- Audio: aiogTTS; generated into a BytesIO and stored in a LargeBinary column.
- Logging: JSON logger, with optional error reporting to Telegram.
## Development
- Lint and type checks via go-task (see Taskfile.yml):
  - Run all checks: task check
  - mypy only: task mypy
  - Ruff lint/format: task ruff-check / task ruff-fix / task ruff-format
  - Update dependencies: task deps-update
  - Clean caches: task clean
- Install dev dependencies (uv):
uv sync --group dev
Note: pyproject.toml defines a [project.scripts] entry point, but the greek_lang:main function is not implemented yet. Use the python -m module invocations shown above.
## Common issues
- No access to OpenAI: check the API_KEY variable and network restrictions.
- PostgreSQL connection error: check DB_HOST/DB_PORT/DB_USER/DB_PASSWORD and that the database exists.
- Alembic does not see the URL: either set ALEMBIC_DB_URL or make sure .env is filled in correctly.
- Redis unavailable: check the REDIS_* variables and that the service is reachable.
## License
Not specified. Add a section with your chosen license if needed.

View File

@@ -35,6 +35,12 @@ tasks:
- task ruff-fix
- task ruff-format
deps-update:
desc: "Update all dependencies (uv lock --upgrade && uv sync)"
cmds:
- uv lock --upgrade
- uv sync
clean:
desc: "Clean cache and temporary files"
cmds:

0
checkers/__init__.py Normal file
View File

19
checkers/check_openai.py Normal file
View File

@@ -0,0 +1,19 @@
import asyncio
from greek_lang.container import init_main_container
from greek_lang.languages import LanguageEnum
from greek_lang.openai_manager.manager import OpenAiManager
async def main():
async with init_main_container() as container:
open_ai_manager: OpenAiManager = await container.openai_container().ai_manager()
source_lang = "el"
target_lang = "ru"
word_response = await open_ai_manager.get_gpt_response(
word="έμπορος", source_lang=source_lang, target_lang=target_lang
)
print(word_response)
if __name__ == "__main__":
asyncio.run(main())

48
cli/translate.py Normal file
View File

@@ -0,0 +1,48 @@
import asyncio
import click
from greek_lang.container import init_main_container
from greek_lang.languages import LanguageEnum
from greek_lang.translator import translate
async def _translate(word: str, source_code: str, target_code: str) -> None:
try:
source_lang = LanguageEnum(source_code.lower())
target_lang = LanguageEnum(target_code.lower())
    except ValueError as exc:  # LanguageEnum(...) raises ValueError for unknown codes
        raise click.BadParameter(f"Unsupported language code: {exc.args[0]}") from exc
async with init_main_container():
result = await translate(word, source_lang, target_lang=target_lang)
click.echo(result)
@click.command(help="Translate a word between the languages of the greek_lang library.")
@click.argument("word")
@click.option(
"-s",
"--source",
"source_code",
default="el",
show_default=True,
type=click.Choice([lang.name for lang in LanguageEnum], case_sensitive=False),
help="Код исходного языка (ISO-639-1, как в LanguageEnum).",
)
@click.option(
"-t",
"--target",
"target_code",
default="ru",
show_default=True,
type=click.Choice([lang.name for lang in LanguageEnum], case_sensitive=False),
help="Код языка перевода (ISO-639-1, как в LanguageEnum).",
)
def cli(word: str, source_code: str, target_code: str) -> None:
"""Обёртка, которая запускает асинхронный перевод через asyncio."""
asyncio.run(_translate(word, source_code, target_code))
if __name__ == "__main__":
cli()

View File

@@ -8,17 +8,27 @@ authors = [
]
requires-python = ">=3.13"
dependencies = [
"aiogram>=3.21.0",
"aiogram-dialog>=2.4.0",
"aiogtts>=1.1.1",
"alembic>=1.16.1",
"asyncpg>=0.30.0",
"click>=8.2.1",
"dependency-injector>=4.47.1",
"greenlet>=3.2.3",
"legacy-cgi>=2.6.3",
"openai>=1.84.0",
"orjson>=3.11.1",
"pendulum>=3.1.0",
"psycopg2-binary>=2.9.10",
"pydantic>=2.11.5",
"pydantic-settings>=2.9.1",
"python-json-logger>=3.3.0",
"redis>=6.4.0",
"requests>=2.32.4",
"sentry-sdk>=2.34.1",
"sqlalchemy>=2.0.41",
"telebot>=0.0.5",
]
[project.scripts]
@@ -36,4 +46,5 @@ dev = [
"pre-commit>=4.2.0",
"pyupgrade>=3.20.0",
"ruff>=0.11.13",
"types-requests>=2.32.4.20250809",
]

View File

View File

@@ -0,0 +1,19 @@
from __future__ import annotations
import io
from aiogtts import aiogTTS # type: ignore[import-untyped]
from ..languages import LanguageEnum
async def get_pronunciation(text: str, source_lang: LanguageEnum) -> io.BytesIO:
aiogtts = aiogTTS()
buffer = io.BytesIO()
await aiogtts.write_to_fp(
text=text,
fp=buffer,
slow=True,
lang=source_lang.value,
)
return buffer
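A minimal usage sketch; the import path matches the one translator.py uses later in this diff:
import asyncio

from greek_lang.audio.manager import get_pronunciation
from greek_lang.languages import LanguageEnum

async def demo() -> None:
    # aiogTTS writes MP3 bytes into the returned BytesIO buffer
    buffer = await get_pronunciation(text="καλημέρα", source_lang=LanguageEnum.el)
    with open("kalimera.mp3", "wb") as f:
        f.write(buffer.getvalue())

asyncio.run(demo())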

View File

@@ -3,6 +3,7 @@ from dependency_injector import containers, providers
from .db_config import PostgresConfig
from .log_config import LoggerConfig
from .openai_config import OpenAiConfig
from .redis_conn import RedisConfig
from .tg_bot_config import TgBotConfig
@@ -11,5 +12,6 @@ class ConfigContainer(containers.DeclarativeContainer):
postgres_config: providers.Provider[PostgresConfig] = providers.Singleton(
PostgresConfig
)
redis_config: providers.Provider[RedisConfig] = providers.Singleton(RedisConfig)
tg_bot_config: providers.Provider[TgBotConfig] = providers.Singleton(TgBotConfig)
openai_config: providers.Provider[OpenAiConfig] = providers.Singleton(OpenAiConfig)

View File

@@ -1,4 +1,5 @@
import pydantic
from pydantic_settings import SettingsConfigDict
from . import EnvConfig
@@ -6,3 +7,7 @@ from . import EnvConfig
class LoggerConfig(EnvConfig):
telegram_bot_token: pydantic.SecretStr | None = None
telegram_chat_id: int | None = None
model_config = SettingsConfigDict(
env_prefix="LOG_",
)

View File

@@ -0,0 +1,17 @@
import pydantic
from pydantic_settings import SettingsConfigDict
from . import EnvConfig
class RedisConfig(EnvConfig):
host: str = pydantic.Field(default="127.0.0.1")
port: int = pydantic.Field(default=6379)
db: int = pydantic.Field(default=0)
username: str | None = pydantic.Field(default=None)
password: pydantic.SecretStr | None = pydantic.Field(default=None)
pool_size: int = pydantic.Field(default=100)
model_config = SettingsConfigDict(
env_prefix="REDIS_",
)

View File

@@ -1,7 +1,12 @@
import pydantic
from pydantic_settings import SettingsConfigDict
from . import EnvConfig
class TgBotConfig(EnvConfig):
model_config = SettingsConfigDict(
env_prefix="TG_",
)
token: pydantic.SecretStr

View File

@@ -6,6 +6,7 @@ from dependency_injector import containers, providers
from .configs.container import ConfigContainer
from .database.container import DatabaseContainer
from .openai_manager.container import OpenAiContainer
from .redis_db.container import RedisContainer
class MainContainer(containers.DeclarativeContainer):
@@ -18,6 +19,9 @@ class MainContainer(containers.DeclarativeContainer):
openai_container = providers.Container(
OpenAiContainer, config_container=config_container
)
redis_container = providers.Container(
RedisContainer, config_container=config_container
)
@contextlib.asynccontextmanager

View File

@@ -4,8 +4,12 @@ import types
def get_app_models_modules() -> list[types.ModuleType]:
from greek_lang.glossaries import models as glossaries_models
from greek_lang.openai_manager import models as openai_manager_models
from greek_lang.users import models as users_models
from greek_lang.srs import models as srs_models
return [
glossaries_models,
openai_manager_models,
users_models,
srs_models,
]

View File

@@ -0,0 +1,36 @@
"""empty message
Revision ID: 55f95da68641
Revises: 19fc4bee7a9f
Create Date: 2025-06-21 20:51:15.097769
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "55f95da68641"
down_revision: Union[str, None] = "19fc4bee7a9f"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column("glossary_word", "audio_file")
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.add_column(
"glossary_word",
sa.Column("audio_file", sa.TEXT(), autoincrement=False, nullable=True),
)
# ### end Alembic commands ###

View File

@@ -0,0 +1,35 @@
"""empty message
Revision ID: 78357f437f61
Revises: 55f95da68641
Create Date: 2025-06-21 20:51:29.437692
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "78357f437f61"
down_revision: Union[str, None] = "55f95da68641"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.add_column(
"glossary_word", sa.Column("audio_file", sa.LargeBinary(), nullable=True)
)
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column("glossary_word", "audio_file")
# ### end Alembic commands ###

View File

@@ -0,0 +1,43 @@
"""empty message
Revision ID: 6b43c7ed8c78
Revises: 78357f437f61
Create Date: 2025-07-16 10:13:26.574794
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "6b43c7ed8c78"
down_revision: Union[str, None] = "78357f437f61"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.alter_column(
"openai_token_usage",
"response_fingerprint",
existing_type=sa.TEXT(),
nullable=True,
)
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.alter_column(
"openai_token_usage",
"response_fingerprint",
existing_type=sa.TEXT(),
nullable=False,
)
# ### end Alembic commands ###

View File

@@ -0,0 +1,34 @@
"""empty message
Revision ID: d30d80dee5a3
Revises: 6b43c7ed8c78
Create Date: 2025-08-10 12:40:24.118166
"""
from typing import Sequence, Union
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "d30d80dee5a3"
down_revision: Union[str, None] = "6b43c7ed8c78"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.create_unique_constraint(
op.f("uq_glossary_word_term"), "glossary_word", ["term"]
)
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.drop_constraint(op.f("uq_glossary_word_term"), "glossary_word", type_="unique")
# ### end Alembic commands ###

View File

@@ -0,0 +1,54 @@
"""empty message
Revision ID: 747797032526
Revises: d30d80dee5a3
Create Date: 2025-08-16 17:53:23.785592
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "747797032526"
down_revision: Union[str, None] = "d30d80dee5a3"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.create_table(
"users",
sa.Column("id", sa.BigInteger(), nullable=False),
sa.Column("is_bot", sa.Boolean(), nullable=False),
sa.Column("first_name", sa.String(), nullable=True),
sa.Column("last_name", sa.String(), nullable=True),
sa.Column("username", sa.String(), nullable=True),
sa.Column("language_code", sa.String(length=8), nullable=True),
sa.Column("is_premium", sa.Boolean(), nullable=True),
sa.Column("added_to_attachment_menu", sa.Boolean(), nullable=True),
sa.Column(
"registered_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.PrimaryKeyConstraint("id", name=op.f("pk_users")),
)
op.create_index(op.f("ix_users_id"), "users", ["id"], unique=False)
op.create_index(op.f("ix_users_username"), "users", ["username"], unique=False)
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.drop_index(op.f("ix_users_username"), table_name="users")
op.drop_index(op.f("ix_users_id"), table_name="users")
op.drop_table("users")
# ### end Alembic commands ###

View File

@@ -0,0 +1,112 @@
"""empty message
Revision ID: 9a2898513cf2
Revises: 747797032526
Create Date: 2025-08-16 19:40:06.376743
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "9a2898513cf2"
down_revision: Union[str, None] = "747797032526"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.create_table(
"srs_progress",
sa.Column("id", sa.BigInteger(), nullable=False),
sa.Column("user_id", sa.BigInteger(), nullable=False),
sa.Column("word_id", sa.BigInteger(), nullable=False),
sa.Column(
"due_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column("interval_days", sa.Integer(), nullable=False),
sa.Column("ease", sa.Float(), nullable=False),
sa.Column("reps", sa.Integer(), nullable=False),
sa.Column("lrn_step", sa.Integer(), nullable=False),
sa.Column(
"state",
sa.Enum(
"learning", "review", "lapsed", name="reviewstate", native_enum=False
),
nullable=False,
),
sa.ForeignKeyConstraint(
["user_id"], ["users.id"], name=op.f("fk_srs_progress_user_id_users")
),
sa.ForeignKeyConstraint(
["word_id"],
["glossary_word.id"],
name=op.f("fk_srs_progress_word_id_glossary_word"),
),
sa.PrimaryKeyConstraint("id", name=op.f("pk_srs_progress")),
sa.UniqueConstraint("user_id", "word_id", name="uq_srs_user_word"),
)
op.create_index(
op.f("ix_srs_progress_due_at"), "srs_progress", ["due_at"], unique=False
)
op.create_index(
op.f("ix_srs_progress_user_id"), "srs_progress", ["user_id"], unique=False
)
op.create_index(
op.f("ix_srs_progress_word_id"), "srs_progress", ["word_id"], unique=False
)
op.create_table(
"srs_review_log",
sa.Column("id", sa.BigInteger(), nullable=False),
sa.Column("user_id", sa.BigInteger(), nullable=False),
sa.Column("word_id", sa.BigInteger(), nullable=False),
sa.Column(
"ts",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column("grade", sa.Integer(), nullable=False),
sa.Column("prev_interval", sa.Integer(), nullable=False),
sa.Column("new_interval", sa.Integer(), nullable=False),
sa.Column("prev_ease", sa.Float(), nullable=False),
sa.Column("new_ease", sa.Float(), nullable=False),
sa.ForeignKeyConstraint(
["user_id"], ["users.id"], name=op.f("fk_srs_review_log_user_id_users")
),
sa.ForeignKeyConstraint(
["word_id"],
["glossary_word.id"],
name=op.f("fk_srs_review_log_word_id_glossary_word"),
),
sa.PrimaryKeyConstraint("id", name=op.f("pk_srs_review_log")),
)
op.create_index(
op.f("ix_srs_review_log_user_id"), "srs_review_log", ["user_id"], unique=False
)
op.create_index(
op.f("ix_srs_review_log_word_id"), "srs_review_log", ["word_id"], unique=False
)
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.drop_index(op.f("ix_srs_review_log_word_id"), table_name="srs_review_log")
op.drop_index(op.f("ix_srs_review_log_user_id"), table_name="srs_review_log")
op.drop_table("srs_review_log")
op.drop_index(op.f("ix_srs_progress_word_id"), table_name="srs_progress")
op.drop_index(op.f("ix_srs_progress_user_id"), table_name="srs_progress")
op.drop_index(op.f("ix_srs_progress_due_at"), table_name="srs_progress")
op.drop_table("srs_progress")
# ### end Alembic commands ###

View File

@@ -3,7 +3,7 @@ from __future__ import annotations
import datetime
import enum
-from sqlalchemy import BigInteger, Text, DateTime, Enum, func
+from sqlalchemy import BigInteger, Text, DateTime, Enum, func, LargeBinary
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.orm import Mapped, mapped_column
@@ -11,7 +11,8 @@ from ..database.base import Base
from ..languages import LanguageEnum
-class LexicalCategoryEnum(str, enum.Enum):
+@enum.unique
+class LexicalCategoryEnum(enum.StrEnum):
noun = "noun"
verb = "verb"
adjective = "adjective"
@@ -37,6 +38,7 @@ class GlossaryWord(Base):
term: Mapped[str] = mapped_column(
Text(),
nullable=False,
unique=True,
)
language: Mapped[LanguageEnum] = mapped_column(
Enum(LanguageEnum, native_enum=False),
@@ -75,8 +77,8 @@
Text(),
nullable=True,
)
-    audio_file: Mapped[str | None] = mapped_column(
-        Text(),
+    audio_file: Mapped[bytes | None] = mapped_column(
+        LargeBinary(),
nullable=True,
)
created_at: Mapped[datetime.datetime] = mapped_column(
@@ -97,7 +99,7 @@
def __repr__(self) -> str:
return (
f"<GlossaryWord(id={self.id}, term='{self.term}', language='{self.language.value}', "
f"<GlossaryWord(id={self.id}, term='{self.term}', language='{self.language}', "
f"translation='{self.translation}', transcription='{self.transcription}', "
f"lexical_category='{self.lexical_category}', meaning_category='{self.meaning_category}')>"
)

View File

@@ -2,7 +2,7 @@ import enum
@enum.unique
-class LanguageEnum(str, enum.Enum):
+class LanguageEnum(enum.StrEnum):
ru = "ru"
en = "en"
el = "el"

255
src/greek_lang/logger.py Normal file
View File

@@ -0,0 +1,255 @@
import base64
import contextlib
import contextvars
import datetime
import logging
import socket
import sys
import typing
from collections.abc import Iterator
from logging.config import dictConfig
from types import TracebackType
import orjson
import sentry_sdk
from .configs.log_config import LoggerConfig
extra_log_context: contextvars.ContextVar[dict[str, str]] = contextvars.ContextVar(
"extra_log_context"
)
@contextlib.contextmanager
def extra_log_context_manager(new_context: dict[str, str]) -> Iterator[None]:
    extra_log_context_data: dict[str, str] = {}
    with contextlib.suppress(LookupError):
        extra_log_context_data = extra_log_context.get()
    # copy before updating so the outer context's mapping is not mutated after reset
    extra_log_context_data = {**extra_log_context_data, **new_context}
token = extra_log_context.set(extra_log_context_data)
try:
yield
finally:
extra_log_context.reset(token)
class NonLoggableExceptionsFilter(logging.Filter):
exclude_exception_types: typing.Sequence[type[Exception]]
def __init__(
self,
*,
exclude_exception_types: typing.Sequence[type[Exception]] = (),
name: str = "",
):
self.exclude_exception_types = exclude_exception_types
super().__init__(name=name)
def filter(self, record: logging.LogRecord) -> bool:
if record.exc_info is None:
return True
try:
exception_type = record.exc_info[0]
except TypeError:
return True
return exception_type not in self.exclude_exception_types
def default_json_serializer(obj: object) -> str:
match obj:
case bytes() as b:
try:
return b.decode("utf-8")
except UnicodeDecodeError:
return base64.b64encode(b).decode("ascii")
case datetime.timedelta() as td:
return str(td.total_seconds())
case datetime.datetime() as dt:
return dt.isoformat()
case datetime.date() as d:
return d.isoformat()
case _:
raise TypeError(f"Type {type(obj)} not serializable")
def json_serializer(data: dict[str, typing.Any], **_: typing.Any) -> str:
extra_log_context_data = {}
with contextlib.suppress(LookupError):
extra_log_context_data = extra_log_context.get()
data.update({"extra_log_context": extra_log_context_data})
return orjson.dumps(
data,
default=default_json_serializer,
).decode()
def get_dict_config(
*,
sentry_dsn: str | None = None,
tg_token: str | None = None,
tg_chat: int | None = None,
exclude_exception_types: typing.Sequence[type[Exception]] = (),
formatters_extension_dict: dict[str, typing.Any] | None = None,
filters_extension_dict: dict[str, typing.Any] | None = None,
handlers_extension_dict: dict[str, typing.Any] | None = None,
loggers_extension_dict: dict[str, typing.Any] | None = None,
) -> dict[str, typing.Any]:
hostname: str = socket.gethostname()
null_handler: dict[str, str] = {
"class": "logging.NullHandler",
}
formatters = {
"verbose": {
"format": f"%(asctime)s [%(levelname)s] [{hostname} %(name)s:%(lineno)s] %(message)s"
},
"json": {
"()": "pythonjsonlogger.jsonlogger.JsonFormatter",
"json_serializer": json_serializer,
"format": "%(asctime)s %(levelname)s %(name)s %(filename)s %(lineno)s %(message)s",
},
} | (formatters_extension_dict or {})
filters = {
"non_loggable_exceptions": {
"()": NonLoggableExceptionsFilter,
"exclude_exception_types": exclude_exception_types,
},
} | (filters_extension_dict or {})
handlers = {
"console_handler": {
"class": "logging.StreamHandler",
"formatter": "verbose",
"filters": [],
},
"telegram_handler": {
"class": "greek_lang.utils.telegram_log.handler.TelegramHandler",
"token": tg_token,
"chat_id": tg_chat,
"logger_name": "console_handler",
"level": "ERROR",
"formatter": "verbose",
"filters": ["non_loggable_exceptions"],
}
        if tg_token and tg_chat
else null_handler,
"sentry_handler": {
"class": "sentry_sdk.integrations.logging.EventHandler",
"level": "ERROR",
"formatter": "verbose",
"filters": ["non_loggable_exceptions"],
}
if sentry_dsn is not None
else null_handler,
} | (handlers_extension_dict or {})
loggers = {
"root": {
"level": "DEBUG",
"handlers": ["console_handler", "telegram_handler", "sentry_handler"],
},
"console": {
"level": "DEBUG",
"handlers": ["console_handler"],
"propagate": False,
},
"telegram.bot": {
"propagate": False,
},
"httpx": {
"level": "DEBUG",
"propagate": True,
},
} | (loggers_extension_dict or {})
return {
"version": 1,
"disable_existing_loggers": False,
"formatters": formatters,
"filters": filters,
"handlers": handlers,
"loggers": loggers,
}
def create_tg_info_logger(
*,
tg_token: str,
    tg_chat: int,
) -> logging.Logger:
logger_name = "tg_info"
dict_config = {
"version": 1,
"disable_existing_loggers": False,
"handlers": {
"telegram_handler": {
"class": "petuh_bot.utils.telegram_log.handler.TelegramHandler",
"logger_name": "console",
"token": tg_token,
"chat_id": tg_chat,
"level": "INFO",
}
},
"loggers": {
logger_name: {
"handlers": ["telegram_handler"],
"propagate": False,
},
},
}
    dictConfig(dict_config)
return logging.getLogger(logger_name)
def init_root_logger(
sentry_dsn: str | None = None,
tg_token: str | None = None,
tg_chat: int | None = None,
exclude_exception_types: typing.Sequence[type[Exception]] = (),
formatters_extension_dict: dict[str, typing.Any] | None = None,
filters_extension_dict: dict[str, typing.Any] | None = None,
handlers_extension_dict: dict[str, typing.Any] | None = None,
loggers_extension_dict: dict[str, typing.Any] | None = None,
) -> logging.Logger:
if sentry_dsn is not None:
sentry_sdk.init(
dsn=sentry_dsn,
traces_sample_rate=1.0,
default_integrations=True,
)
dict_config = get_dict_config(
sentry_dsn=sentry_dsn,
tg_token=tg_token,
tg_chat=tg_chat,
exclude_exception_types=exclude_exception_types,
formatters_extension_dict=formatters_extension_dict,
filters_extension_dict=filters_extension_dict,
handlers_extension_dict=handlers_extension_dict,
loggers_extension_dict=loggers_extension_dict,
)
dictConfig(dict_config)
return logging.getLogger()
loggers_ext: dict[str, typing.Any] = {}
def _exc_hook_patched(
exc_type: type[BaseException],
exc_val: BaseException,
exc_tb: TracebackType,
) -> None:
if isinstance(exc_val, KeyboardInterrupt):
return
logging.critical(
f"Uncaught exception: {exc_type}", exc_info=(exc_type, exc_val, exc_tb)
)
def setup() -> None:
config = LoggerConfig()
init_root_logger(
tg_token=config.telegram_bot_token.get_secret_value()
if config.telegram_bot_token
else None,
tg_chat=config.telegram_chat_id,
loggers_extension_dict=loggers_ext,
)
sys.excepthook = _exc_hook_patched # type: ignore[assignment]

View File

@@ -5,15 +5,23 @@ import dataclasses
import pydantic
from openai import AsyncOpenAI
from greek_lang.languages import LanguageEnum
from greek_lang.glossaries.models import LexicalCategoryEnum
class WordInfo(pydantic.BaseModel):
lemma: str = pydantic.Field(
...,
description="lemma (base form) - for verbs, use the 1st person singular in present indicative, "
"for nouns and adjectives, use the nominative singular masculine (for adjectives)",
)
transcription: str = pydantic.Field(
...,
description="phonetic transcription in IPA",
description="lemma phonetic transcription in IPA",
)
translation: str = pydantic.Field(
...,
description="translation in {target_language}",
description="lemma translation in {target_language}",
)
description: str = pydantic.Field(
...,
@@ -21,19 +29,19 @@ class WordInfo(pydantic.BaseModel):
)
part_of_speech: str = pydantic.Field(
...,
description="part of speech in {target_language}",
description=f"part of speech, one of {[cat.value for cat in LexicalCategoryEnum]}",
)
example: str = pydantic.Field(
...,
description="example",
description="lemma example",
)
example_transcription: str = pydantic.Field(
...,
description="phonetic transcription in IPA of an example",
description="lemma phonetic transcription in IPA of an example",
)
example_translation: str = pydantic.Field(
...,
description="translation of the example in {target_language}",
description="lemma translation of the example in {target_language}",
)
category: str = pydantic.Field(
...,
@@ -53,8 +61,8 @@ class OpenAiManager:
self,
*,
word: str,
-        source_lang: str,
-        target_lang: str,
+        source_lang: LanguageEnum,
+        target_lang: LanguageEnum,
model: str = "gpt-4o",
) -> WordInfo:
system_message = {
@@ -63,7 +71,7 @@
}
user_message = {
"role": "user",
"content": f'Provide detailed information about the word "{word}" in language {source_lang}, set {{target_language}} = {target_lang}.',
"content": f'Provide detailed information about the word "{word}" in language {source_lang!s}, set {{target_language}} = {target_lang!s}.',
}
response = await self.client.beta.chat.completions.parse(
model=model,

View File

@@ -40,7 +40,7 @@ class OpenAiTokenUsage(Base):
)
response_fingerprint: Mapped[str] = mapped_column(
Text(),
-        nullable=False,
+        nullable=True,
index=True,
)
completion_tokens: Mapped[int] = mapped_column(

View File

View File

@@ -0,0 +1,24 @@
from collections.abc import AsyncIterator
from dependency_injector import containers, providers
from ..configs.redis_conn import RedisConfig
from .redis_conn import create_redis_pool, RedisPool
async def create_redis_pool_resource(
redis_config: RedisConfig,
) -> AsyncIterator[RedisPool]:
redis_pool = await create_redis_pool(redis_config)
try:
yield redis_pool
finally:
await redis_pool.aclose()
class RedisContainer(containers.DeclarativeContainer):
config_container = providers.DependenciesContainer()
redis_pool: providers.Resource[RedisPool] = providers.Resource(
create_redis_pool_resource,
redis_config=config_container.redis_config,
)

View File

@@ -0,0 +1,51 @@
import dataclasses
from typing import TypeAlias
import redis
from ..configs.redis_conn import RedisConfig
RedisPool: TypeAlias = redis.asyncio.Redis
@dataclasses.dataclass(frozen=True)
class RedisConnectionParams:
host: str = "127.0.0.1"
port: int = 6379
db: int = 0
username: str | None = None
password: str | None = None
max_connections: int = 2**31
socket_timeout: float = 5.0
def create_redis_single_pool(
redis_conn_params: RedisConnectionParams,
) -> redis.asyncio.Redis:
redis_url = f"redis://{redis_conn_params.host}:{redis_conn_params.port}/{redis_conn_params.db}"
connection: redis.asyncio.Redis = redis.asyncio.from_url( # type: ignore[no-untyped-call]
redis_url,
username=redis_conn_params.username,
password=redis_conn_params.password,
decode_responses=False,
socket_connect_timeout=redis_conn_params.socket_timeout,
max_connections=redis_conn_params.max_connections,
)
return connection
async def create_redis_pool(  # async so the RedisContainer resource above can await it
redis_config: RedisConfig,
) -> RedisPool:
redis_conn_params = RedisConnectionParams(
host=redis_config.host,
port=redis_config.port,
db=redis_config.db,
username=redis_config.username,
password=(
redis_config.password.get_secret_value() if redis_config.password else None
),
max_connections=redis_config.pool_size,
)
return create_redis_single_pool(redis_conn_params)
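A hedged sketch of exercising the pool directly, outside the DI container (assumes the REDIS_* variables are set in the environment or .env):
import asyncio

from greek_lang.configs.redis_conn import RedisConfig
from greek_lang.redis_db.redis_conn import create_redis_pool

async def demo() -> None:
    pool = await create_redis_pool(RedisConfig())
    try:
        print(await pool.ping())  # True when Redis is reachable
    finally:
        await pool.aclose()

asyncio.run(demo())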

View File

View File

@@ -0,0 +1,24 @@
from datetime import datetime, timedelta
from .models import UserWordProgress, ReviewState
def sm2_update(p: UserWordProgress, grade: int, now: datetime) -> None:
# grade: 0..5
e = p.ease + (0.1 - (5 - grade) * (0.08 + (5 - grade) * 0.02))
p.ease = max(1.3, e)
if grade < 3:
p.interval_days = 1
p.reps = 0
p.lrn_step = 0
p.state = ReviewState.lapsed
else:
if p.reps == 0:
p.interval_days = 1
elif p.reps == 1:
p.interval_days = 6
else:
p.interval_days = round(p.interval_days * p.ease)
p.reps += 1
p.state = ReviewState.review
p.due_at = now + timedelta(days=p.interval_days)
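A quick simulation of the progression (the sm2 module path is assumed; the file name is not shown in this diff). With a constant grade of 4 the ease stays at 2.5 and the intervals grow 1 → 6 → 15 → 38 days:
import datetime

from greek_lang.srs.models import ReviewState, UserWordProgress
from greek_lang.srs.sm2 import sm2_update  # assumed module path

now = datetime.datetime.now(datetime.UTC)
progress = UserWordProgress(
    interval_days=1, ease=2.5, reps=0, lrn_step=0, state=ReviewState.learning
)
for _ in range(4):
    sm2_update(progress, grade=4, now=now)
    print(progress.reps, progress.interval_days, progress.ease)
# prints: 1 1 2.5 / 2 6 2.5 / 3 15 2.5 / 4 38 2.5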

View File

@@ -0,0 +1,67 @@
import enum
import datetime as dt
from sqlalchemy import (
BigInteger,
Integer,
Float,
DateTime,
Enum,
ForeignKey,
UniqueConstraint,
func,
)
from sqlalchemy.orm import Mapped, mapped_column
from greek_lang.database.base import Base
@enum.unique
class ReviewState(enum.StrEnum):
learning = "learning"
review = "review"
lapsed = "lapsed"
class UserWordProgress(Base):
__tablename__ = "srs_progress"
id: Mapped[int] = mapped_column(BigInteger, primary_key=True)
user_id: Mapped[int] = mapped_column(
ForeignKey("users.id"), index=True, nullable=False
)
word_id: Mapped[int] = mapped_column(
ForeignKey("glossary_word.id"), index=True, nullable=False
)
due_at: Mapped[dt.datetime] = mapped_column(
DateTime(timezone=True), index=True, server_default=func.now(), nullable=False
)
interval_days: Mapped[int] = mapped_column(Integer, nullable=False, default=1)
ease: Mapped[float] = mapped_column(Float, nullable=False, default=2.5)
reps: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
lrn_step: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
state: Mapped[ReviewState] = mapped_column(
Enum(ReviewState, native_enum=False),
nullable=False,
default=ReviewState.learning,
)
__table_args__ = (UniqueConstraint("user_id", "word_id", name="uq_srs_user_word"),)
class ReviewLog(Base):
__tablename__ = "srs_review_log"
id: Mapped[int] = mapped_column(BigInteger, primary_key=True)
user_id: Mapped[int] = mapped_column(
ForeignKey("users.id"), index=True, nullable=False
)
word_id: Mapped[int] = mapped_column(
ForeignKey("glossary_word.id"), index=True, nullable=False
)
ts: Mapped[dt.datetime] = mapped_column(
DateTime(timezone=True), server_default=func.now(), nullable=False
)
grade: Mapped[int] = mapped_column(Integer, nullable=False) # 0..5
prev_interval: Mapped[int] = mapped_column(Integer, nullable=False)
new_interval: Mapped[int] = mapped_column(Integer, nullable=False)
prev_ease: Mapped[float] = mapped_column(Float, nullable=False)
new_ease: Mapped[float] = mapped_column(Float, nullable=False)

View File

@@ -0,0 +1,28 @@
from collections.abc import Sequence
from sqlalchemy import select, func
from sqlalchemy.ext.asyncio import AsyncSession
from greek_lang.glossaries.models import GlossaryWord
from .models import UserWordProgress
async def pick_due_words(
db: AsyncSession,
user_id: int,
limit: int = 10,
) -> Sequence[tuple[GlossaryWord, UserWordProgress]]:
stmt = (
select(GlossaryWord, UserWordProgress)
.join(UserWordProgress, UserWordProgress.word_id == GlossaryWord.id)
.where(
UserWordProgress.user_id == user_id,
UserWordProgress.due_at <= func.now(),
)
.order_by(UserWordProgress.due_at)
.limit(limit)
)
result = await db.execute(stmt)
return result.tuples().all()
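A hedged sketch of calling it through the DI-provided session maker, mirroring the session handling in translator.py; the scheduler module path is assumed, and the container must be initialized (init_main_container) so the providers are wired:
from dependency_injector.wiring import Provide, inject
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker

from greek_lang.database.container import DatabaseContainer
from greek_lang.srs.scheduler import pick_due_words  # assumed module path

@inject
async def show_due_words(
    user_id: int,
    db_session_maker: async_sessionmaker[AsyncSession] = Provide[
        DatabaseContainer.async_session_maker
    ],
) -> None:
    async with db_session_maker() as db_session:
        for word, progress in await pick_due_words(db_session, user_id, limit=5):
            print(word.term, progress.due_at)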

View File

View File

@@ -0,0 +1,17 @@
import asyncio
from greek_lang import logger
from greek_lang.container import init_main_container
from greek_lang.tg_bot import app
async def main() -> None:
logger.setup()
async with init_main_container():
bot = app.create_bot()
dispatcher = await app.create_dispatcher()
await app.run_bot(bot=bot, dispatcher=dispatcher)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -0,0 +1,69 @@
import pydantic
from aiogram import Bot, Dispatcher, BaseMiddleware
from aiogram.fsm.storage.base import BaseStorage, DefaultKeyBuilder
from aiogram.fsm.storage.redis import RedisStorage
from aiogram.types import BotCommandScopeAllPrivateChats
from aiogram_dialog import setup_dialogs
from dependency_injector.wiring import Provide, inject
from ..configs.container import ConfigContainer
from ..redis_db.container import RedisContainer
from ..redis_db.redis_conn import RedisPool
@inject
def create_bot(
bot_token: pydantic.SecretStr = Provide[
ConfigContainer.tg_bot_config.provided.token
],
) -> Bot:
bot = Bot(
token=bot_token.get_secret_value(),
)
return bot
async def create_dispatcher() -> Dispatcher:
from .router import router as root_router
from .dialogs import dialog
fsm_storage = await create_fsm_storage()
dp = Dispatcher(
storage=fsm_storage,
)
middlewares: list[BaseMiddleware] = []
for middleware in middlewares:
dp.update.middleware(middleware)
dp.include_routers(dialog, root_router)
setup_dialogs(dp)
return dp
@inject
async def create_fsm_storage(
redis_pool: RedisPool = Provide[RedisContainer.redis_pool],
) -> BaseStorage:
storage = RedisStorage(
redis=redis_pool,
key_builder=DefaultKeyBuilder(
prefix="fsm",
with_destiny=True,
),
)
return storage
async def run_bot(
bot: Bot,
dispatcher: Dispatcher,
) -> None:
await bot.delete_webhook(drop_pending_updates=True)
await bot.delete_my_commands(scope=BotCommandScopeAllPrivateChats())
await dispatcher.start_polling(
bot, allowed_updates=dispatcher.resolve_used_update_types()
)

View File

@@ -0,0 +1,19 @@
from aiogram import Router
from aiogram.filters import CommandStart
from aiogram.types import Message
from aiogram_dialog import DialogManager, StartMode
from greek_lang.tg_bot.dialogs.states import States
from greek_lang.users.manager import get_or_create_telegram_user
router = Router()
@router.message(CommandStart())
async def start(message: Message, dialog_manager: DialogManager) -> None:
user = message.from_user
if user is None:
return
await get_or_create_telegram_user(user)
await dialog_manager.start(States.main_menu, mode=StartMode.RESET_STACK)

View File

@@ -0,0 +1,11 @@
from aiogram_dialog import Dialog
from .main_menu import windows as main_windows
from .add_word import windows as add_word_windows
dialog = Dialog(
main_windows.main_window,
add_word_windows.add_word_window,
add_word_windows.add_word_result_window,
)

View File

@@ -0,0 +1,65 @@
from aiogram.types import Message, CallbackQuery, BufferedInputFile
from aiogram_dialog import DialogManager
from aiogram_dialog.widgets.input import MessageInput
from aiogram_dialog.widgets.kbd import Button
from greek_lang.languages import LanguageEnum
from greek_lang.translator import translate
from ..states import States
async def add_word(
message: Message,
source: MessageInput | Button,
manager: DialogManager,
) -> None:
if not message.text:
return
word = message.text.strip()
if not word:
return
source_lang = LanguageEnum.ru
target_lang = LanguageEnum.el
glossary_word = await translate(word, source_lang, target_lang=target_lang)
# Try to send audio pronunciation back to the user
try:
audio_bytes = getattr(glossary_word, "audio_file", None)
if audio_bytes:
# aiogTTS produces MP3 data; send as audio
caption = (
f"<b>{glossary_word.term}</b> → <b>{glossary_word.translation}</b>"
)
input_file = BufferedInputFile(
audio_bytes, filename=f"{glossary_word.term}.mp3"
)
await message.answer_audio(
audio=input_file, caption=caption, parse_mode="HTML"
)
except Exception:
# Silently ignore audio sending issues to not break the flow
pass
# Store data for the result window
manager.dialog_data.update(
{
"term": glossary_word.term,
"translation": glossary_word.translation,
"transcription": glossary_word.transcription or "",
"lexical_category": getattr(glossary_word, "lexical_category", ""),
"description": glossary_word.description or "",
"example": glossary_word.example or "",
"note": glossary_word.note or "",
}
)
# Switch to the result window state
await manager.switch_to(States.add_word_result)
async def on_add_another(
callback: CallbackQuery,
button: Button,
manager: DialogManager,
) -> None:
await callback.answer()
await manager.switch_to(States.add_word)

View File

@@ -0,0 +1,73 @@
from typing import Any
from aiogram.enums import ParseMode
from aiogram_dialog import DialogManager, Window
from aiogram_dialog.widgets.input import MessageInput
from aiogram_dialog.widgets.kbd import Button, Row
from aiogram_dialog.widgets.markup.reply_keyboard import ReplyKeyboardFactory
from aiogram_dialog.widgets.text import Format, Const
from ..states import States
from ..base_handlers import cancel_handler
from . import handlers
async def add_word_window_getter(
dialog_manager: DialogManager, **kwargs: Any
) -> dict[str, Any]:
return {}
add_word_window = Window(
Const("<b>Введите слово</b>:"),
MessageInput(func=handlers.add_word),
Row(
Button(
Format("❌ Отмена"),
on_click=cancel_handler, # type: ignore[arg-type]
id="cancel_add_word",
),
),
markup_factory=ReplyKeyboardFactory(
input_field_placeholder=Format("Слово..."),
resize_keyboard=True,
one_time_keyboard=True,
),
state=States.add_word,
parse_mode=ParseMode.HTML,
getter=add_word_window_getter,
)
async def add_word_result_getter(
dialog_manager: DialogManager, **kwargs: Any
) -> dict[str, Any]:
# Expose dialog_data fields directly for Format widgets
return dialog_manager.dialog_data
add_word_result_window = Window(
Format(
"✅ <b>Слово добавлено</b>!\n\n"
"<b>{term}</b> → <b>{translation}</b>\n"
"{transcription}\n"
"<b>Часть речи</b>: {lexical_category!s}\n"
"<b>Описание</b>: {description}\n"
"<b>Пример</b>: {example}"
),
Row(
Button(
Const(" Добавить ещё"),
id="add_another",
on_click=handlers.on_add_another,
),
Button(
Const("🏠 В меню"),
on_click=cancel_handler, # type: ignore[arg-type]
id="to_main_menu",
),
),
state=States.add_word_result,
parse_mode=ParseMode.HTML,
getter=add_word_result_getter,
)

View File

@@ -0,0 +1,16 @@
from aiogram_dialog import DialogManager, ShowMode
from aiogram_dialog.api.internal import ReplyCallbackQuery
from aiogram_dialog.widgets.kbd import Cancel
from .states import States
async def cancel_handler(
callback_query: ReplyCallbackQuery,
button: Cancel,
manager: DialogManager,
) -> None:
await manager.switch_to(
States.main_menu,
show_mode=ShowMode.DELETE_AND_SEND,
)

View File

@@ -0,0 +1,26 @@
from aiogram.types import CallbackQuery
from aiogram_dialog import DialogManager, ShowMode
from aiogram_dialog.api.internal import ReplyCallbackQuery
from aiogram_dialog.widgets.kbd import Cancel, Button
from ..states import States
async def on_add_word(
callback: CallbackQuery,
button: Button,
manager: DialogManager,
) -> None:
await callback.answer()
await manager.switch_to(States.add_word)
async def cancel_handler(
callback_query: ReplyCallbackQuery,
button: Cancel,
manager: DialogManager,
) -> None:
await manager.switch_to(
States.add_word,
show_mode=ShowMode.DELETE_AND_SEND,
)

View File

@@ -0,0 +1,32 @@
from typing import Any
from aiogram.enums import ParseMode
from aiogram_dialog import DialogManager, Window
from aiogram_dialog.widgets.kbd import Row, Button
from aiogram_dialog.widgets.text import Format, Const
from ..states import States
from . import handlers
async def main_getter(dialog_manager: DialogManager, **kwargs: Any) -> dict[str, Any]:
return {}
main_window = Window(
Format(
"<b>Выбери действие</b>:",
when=lambda data, widget, dialog_manager: data["dialog_data"].get("action")
is None,
),
Row(
Button(
Const("Добавить слово"),
id="add_word",
on_click=handlers.on_add_word,
),
),
state=States.main_menu,
getter=main_getter,
parse_mode=ParseMode.HTML,
)

View File

@@ -0,0 +1,7 @@
from aiogram.fsm.state import State, StatesGroup
class States(StatesGroup):
main_menu = State()
add_word = State()
add_word_result = State()

View File

@@ -0,0 +1,8 @@
from aiogram import Router
from .commands import router as commands_router
router = Router()
router.include_routers(
commands_router,
)

View File

@@ -0,0 +1,73 @@
from dependency_injector.wiring import inject, Provide
from sqlalchemy.ext.asyncio import async_sessionmaker, AsyncSession
from sqlalchemy import func
from sqlalchemy.dialects.postgresql import insert
from greek_lang.audio.manager import get_pronunciation
from greek_lang.database.container import DatabaseContainer
from greek_lang.languages import LanguageEnum
from greek_lang.openai_manager.container import OpenAiContainer
from greek_lang.openai_manager.manager import OpenAiManager
from greek_lang.glossaries.models import GlossaryWord, LexicalCategoryEnum
@inject
async def translate(
word: str,
source_lang: LanguageEnum,
target_lang: LanguageEnum = LanguageEnum.ru,
note: str | None = None,
tags: tuple[str, ...] = tuple(),
open_ai_manager: OpenAiManager = Provide[OpenAiContainer.ai_manager],
db_session_maker: async_sessionmaker[AsyncSession] = Provide[
DatabaseContainer.async_session_maker,
],
) -> GlossaryWord:
word_response = await open_ai_manager.get_gpt_response(
word=word,
source_lang=source_lang,
target_lang=target_lang,
)
pronon = await get_pronunciation(text=word_response.lemma, source_lang=source_lang)
async with db_session_maker() as db_session, db_session.begin():
values = {
"term": word_response.lemma,
"language": source_lang,
"transcription": word_response.transcription,
"translation": word_response.translation,
"description": word_response.description,
"lexical_category": LexicalCategoryEnum(word_response.part_of_speech),
"meaning_category": word_response.category,
"example": f"{word_response.example}({word_response.example_translation})",
"etymology": word_response.etymology,
"note": note,
"tags": list(tags),
"audio_file": pronon.getvalue(),
}
stmt = (
insert(GlossaryWord)
.values(**values)
.on_conflict_do_update(
index_elements=[GlossaryWord.term],
set_={
"term": values["term"],
"language": values["language"],
"transcription": values["transcription"],
"translation": values["translation"],
"description": values["description"],
"lexical_category": values["lexical_category"],
"meaning_category": values["meaning_category"],
"example": values["example"],
"etymology": values["etymology"],
"note": values["note"],
"tags": values["tags"],
"audio_file": values["audio_file"],
"updated_at": func.now(),
},
)
.returning(GlossaryWord)
)
result = await db_session.execute(stmt)
glossary_word = result.scalar_one()
return glossary_word

View File

View File

@@ -0,0 +1,39 @@
from aiogram.types import User
from dependency_injector.wiring import Provide, inject
from sqlalchemy.exc import IntegrityError
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
from greek_lang.database.container import DatabaseContainer
from greek_lang.users.models import User as TgUser
@inject
async def get_or_create_telegram_user(
user: User,
db_session_maker: async_sessionmaker[AsyncSession] = Provide[
DatabaseContainer.async_session_maker
],
) -> TgUser:
async with db_session_maker() as db_session, db_session.begin():
telegram_user: TgUser | None = await db_session.get(TgUser, user.id)
if telegram_user:
return telegram_user
try:
async with db_session_maker() as db_session, db_session.begin():
telegram_user = TgUser(
id=user.id,
username=user.username,
first_name=user.first_name,
last_name=user.last_name,
language_code=user.language_code,
is_bot=user.is_bot,
is_premium=user.is_premium,
added_to_attachment_menu=user.added_to_attachment_menu,
)
db_session.add(telegram_user)
return telegram_user
except IntegrityError:
telegram_user = await db_session.get(TgUser, user.id)
if telegram_user is None:
raise Exception(f"Can't find telegram_user = {user.id}") from None
return telegram_user

View File

@@ -0,0 +1,26 @@
from datetime import datetime
from sqlalchemy import BigInteger, Boolean, DateTime, String, func
from sqlalchemy.orm import Mapped, mapped_column
from greek_lang.database.base import Base
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(BigInteger, primary_key=True, index=True)
is_bot: Mapped[bool] = mapped_column(Boolean, nullable=False)
first_name: Mapped[str | None] = mapped_column(String, nullable=True)
last_name: Mapped[str | None] = mapped_column(String, nullable=True)
username: Mapped[str | None] = mapped_column(String, nullable=True, index=True)
language_code: Mapped[str | None] = mapped_column(String(length=8), nullable=True)
is_premium: Mapped[bool | None] = mapped_column(Boolean, nullable=True)
added_to_attachment_menu: Mapped[bool | None] = mapped_column(
Boolean, nullable=True
)
registered_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
server_default=func.now(),
nullable=False,
)

View File

View File

@@ -0,0 +1,103 @@
import contextlib
import copy
import datetime
import io
import logging
import logging.config
import types
import cgitb  # type: ignore[import-untyped]  # removed from the stdlib in Python 3.13; provided by the legacy-cgi dependency
import telebot
from requests import ReadTimeout
ExcInfoType = tuple[type[BaseException], BaseException, types.TracebackType]
class TelegramHandler(logging.Handler):
bot: telebot.TeleBot
chat_id: int
    logger: logging.Logger | None
def __init__(self, *, token: str, chat_id: int, logger_name: str | None = None):
logging.Handler.__init__(self)
self.bot = telebot.TeleBot(token)
self.chat_id = chat_id
self.logger = logging.getLogger(logger_name) if logger_name else None
@staticmethod
def get_tb_data(exc_info: ExcInfoType, output_format: str = "html") -> io.BytesIO:
string_io_buffer = io.StringIO()
context_width = 11
cgitb.Hook(
context=context_width,
file=string_io_buffer,
format=output_format,
).handle(info=exc_info)
string_io_buffer.seek(0)
encoding = "utf-8"
bytes_io_buffer = io.BytesIO(string_io_buffer.read().encode(encoding))
bytes_io_buffer.seek(0)
return bytes_io_buffer
@staticmethod
def prepare(log_data: str, length: int) -> str:
message = log_data[:length]
return message
def emit(self, record: logging.LogRecord) -> None:
try:
if record.exc_info is None:
self.send_plain_text(record)
else:
self.send_traceback(record)
except ReadTimeout:
if self.logger:
self.logger.error("Telegram request timed out")
except BaseException as exc:
if self.logger:
self.logger.exception(
f"Telegram Log Handler Unexpected Exception Occurred: {exc}"
)
def send_traceback(self, record: logging.LogRecord) -> None:
tb_data_html = self.get_tb_data(record.exc_info, output_format="html") # type: ignore
tb_data_plain = self.get_tb_data(record.exc_info, output_format="plain") # type: ignore
with contextlib.closing(tb_data_html), contextlib.closing(tb_data_plain):
filename = datetime.datetime.now().strftime("python_tb_%Y-%m-%d_%H_%M_%S")
caption = self.get_exc_caption_text(record)
self.bot.send_media_group(
chat_id=self.chat_id,
media=[
telebot.types.InputMediaDocument(
telebot.types.InputFile(
tb_data_html, file_name=filename + ".html"
),
caption=caption,
),
telebot.types.InputMediaDocument(
telebot.types.InputFile(
tb_data_plain, file_name=filename + ".txt"
)
),
],
timeout=5,
)
def get_exc_caption_text(self, record: logging.LogRecord) -> str:
caption_length = 200
no_exc_record = self.get_no_exc_record_copy(record)
caption = self.prepare(self.format(no_exc_record), caption_length)
return caption
@staticmethod
def get_no_exc_record_copy(record: logging.LogRecord) -> logging.LogRecord:
no_exc_record = copy.copy(record)
no_exc_record.exc_info = None
no_exc_record.exc_text = None
return no_exc_record
def send_plain_text(self, record: logging.LogRecord) -> None:
message_length = 4096
text = self.prepare(self.format(record), message_length)
self.bot.send_message(self.chat_id, text, timeout=5)
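The handler can also be attached by hand; a sketch with placeholder credentials (in the app they come from LOG_TELEGRAM_BOT_TOKEN / LOG_TELEGRAM_CHAT_ID):
import logging

from greek_lang.utils.telegram_log.handler import TelegramHandler

# hypothetical token and chat id, for illustration only
handler = TelegramHandler(token="123456:ABC...", chat_id=123456789, logger_name=None)
handler.setLevel(logging.ERROR)
handler.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(message)s"))
logging.getLogger().addHandler(handler)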

919
uv.lock

File diff suppressed because it is too large