Your self hosted YouTube media server
Browser Extension: Tube Archivist Companion
Project Documentation
Core functionality
- Subscribe to your favorite YouTube channels
- Download Videos
- Index and make videos searchable
- Play videos
- Keep track of viewed and unviewed videos
Problem Description
Once your YouTube video collection grows, it becomes hard to search and find a specific video. That's where Tube Archivist comes in: By indexing your video collection with metadata from YouTube, you can organize, search and enjoy your archived YouTube videos without hassle offline through a convenient web interface.
Latest Release:
Release tag: v0.5.0
Release date: 2025-03-09
Project Updates, Breaking changes
- There are several breaking changes, read this carefully before updating.
- If you accidentally updated without reading this, you can revert the image to `bbilly1/tubearchivist:v0.4.13`.
- This ships the new React-based frontend. Big shout-out to @MerlinScheurer for taking on the bulk of the work.
- Additionally this ships a major refactor of the backend code organization.
- Also shoutout to @kralverde for helping in the backend refactor.
- This is the first iteration, so there might be bugs.
- The compose file is also updated, see all changes here.
Migration Guide
Local db.sqlite3
Due to the backend refactor, changes were introduced that made persisting your user accounts and schedules infeasible. Fear not, there is a convenient migration script for export and import.
- On TA v0.4.13, back up your configuration by executing `python manage.py ta_config_backup`, e.g. `docker compose exec -it tubearchivist python manage.py ta_config_backup`. This will create a migration file at `/cache/backup/migration.json`.
- Double check that file, you should see your user(s), API token and your schedules.
- Then stop all containers.
- Delete the db file from `/cache/db.sqlite3`.
- Pull the new TA image and let the initial setup complete, wait until you can reach the login page.
  - You might have to set `REDIS_CON` here already, see below.
- Then restore the backup by executing `python manage.py ta_config_restore`, e.g. `docker compose exec -it tubearchivist python manage.py ta_config_restore`. This will restore your configurations from backup.
- Log in with your usual username/password and double check your API key and schedule config.
- Delete `/cache/backup/migration.json` to avoid confusion.
Migrating the db.sqlite3 configuration is not strictly necessary. You can also just delete the file, and let it recreate at startup. You'll have to reconfigure:
- Any changes you made on your user like name/password
- The API key will have changed, you'll need to update that in e.g. the browser extension
- All schedules will be reset to default, you'll have to reconfigure them through the interface.
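For the "double check that file" step of the migration, a small script can list what the export contains before you delete the database. This is a minimal sketch: the internal layout of `migration.json` is not described here, so the hypothetical helper `summarize_migration` only reports the top-level structure and assumes nothing beyond the file being valid JSON.

```python
import json
from pathlib import Path


def summarize_migration(path):
    """Return {top_level_key: type_name} for a JSON export file."""
    content = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(content, dict):
        return {key: type(value).__name__ for key, value in content.items()}
    # Not an object at the top level: report the container type only.
    return {"<root>": type(content).__name__}


if __name__ == "__main__":
    # Path as seen from inside the container; adjust for your volume mapping.
    target = Path("/cache/backup/migration.json")
    if target.exists():
        for key, kind in summarize_migration(target).items():
            print(f"{key}: {kind}")
```

If the output does not hint at users, an API token and schedules, it is probably worth inspecting the file by hand before continuing.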
Redis
The configuration to connect TA with Redis has changed. There is now a single environment variable called `REDIS_CON` to tell TA where to reach Redis. If you are using the defaults, set this to `redis://archivist-redis:6379`. This allows for more flexibility to connect to a wide range of Redis configurations.
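To sanity check a connection string before starting the stack, something like the following could be used. It is a sketch only: the helper name `parse_redis_con` is made up, and the accepted schemes (`redis`, `rediss`) are an assumption based on common Redis client conventions, not a statement about what TA itself accepts.

```python
from urllib.parse import urlsplit

DEFAULT_REDIS_PORT = 6379


def parse_redis_con(value):
    """Split a Redis URL into (host, port), raising ValueError if malformed."""
    parts = urlsplit(value)
    # Assumption: only these two schemes are treated as valid here.
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError(f"unexpected or missing scheme in REDIS_CON: {value!r}")
    if not parts.hostname:
        raise ValueError(f"no host found in REDIS_CON: {value!r}")
    return parts.hostname, parts.port or DEFAULT_REDIS_PORT


if __name__ == "__main__":
    print(parse_redis_con("redis://archivist-redis:6379"))
```

A value without the `redis://` prefix, such as a bare `archivist-redis:6379`, is rejected by this check.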
Additionally, this project no longer depends on RedisJSON, but just on plain Redis. There is a migration step that runs at first start; you'll see `✓ migrated appconfig to ES` confirming the migration. If there is nothing to migrate, you'll see `no config values to migrate`, meaning it's safe to switch:
- Stop all containers
- From the Redis volume delete the `dump.rdb` file
- Change the image from `redis/redis-stack-server` to just default `redis`.
Note:
- you don't have to change the redis image, you could use the stack image or any compatible alternative, but you have to reset the `dump.rdb` file.
- this will not migrate your cookie. If you have set that, you'll have to import that again.
- this will not migrate any videos in "Continue watching", these positions will be lost.
TA_HOST protocol
If you are accessing TA behind an SSL reverse proxy, specifying the protocol is now required for the `TA_HOST` environment variable, e.g. `https://`. For the sake of consistency, also specify the protocol if you access TA without SSL, e.g. `http://`.
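A quick way to check a `TA_HOST` value for the protocol prefix is sketched below. The function name and the example hostname `tube.example.com` are placeholders, and the check assumes a single URL-like value.

```python
def has_protocol(ta_host):
    """Return True if the value starts with an explicit http(s) protocol."""
    return ta_host.startswith(("http://", "https://"))


if __name__ == "__main__":
    for candidate in ("tube.example.com", "https://tube.example.com"):
        status = "ok" if has_protocol(candidate) else "missing protocol"
        print(f"{candidate}: {status}")
```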
Backend port overwrite
If you previously used TA_UWSGI_PORT to modify the backend port, use the better named variable: TA_BACKEND_PORT.
Cast and Static Auth
Previously the Cast integration was enabled with the env var `ENABLE_CAST`. You can now configure that in the integrations section on the config page.
There is an additional environment variable called `DISABLE_STATIC_AUTH` that disables authentication on static files, required for Cast to work.
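To spot leftovers from an older setup, the replaced variables named in the sections above (`TA_UWSGI_PORT`, `ENABLE_CAST`) can be looked up in the environment. This is an illustrative sketch with a hypothetical helper name; the hints simply restate the notes above and the list is not meant to be exhaustive.

```python
import os

# Old variable name -> what these release notes say to do instead.
OUTDATED_VARS = {
    "TA_UWSGI_PORT": "use TA_BACKEND_PORT instead",
    "ENABLE_CAST": "configure Cast in the integrations section on the config page",
}


def find_outdated_vars(env):
    """Return {name: hint} for every outdated variable present in env."""
    return {name: hint for name, hint in OUTDATED_VARS.items() if name in env}


if __name__ == "__main__":
    for name, hint in find_outdated_vars(os.environ).items():
        print(f"{name} is set: {hint}")
```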
Appsettings
This is the last step for moving the Redis config to ES. At startup the appsettings will get migrated from Redis to ES. That should be seamless, but depending on what values you might have set, this can create data type conflicts. In that case, at first startup you'll see a message like `document_parsing_exception` and `failed to parse field [...] of type [...] in document with id 'appsettings'`.
- If you encounter that, you'll need to reset the appsettings index. From within the ES container run:
curl -XDELETE -u elastic:$ELASTIC_PASSWORD "localhost:9200/ta_config?pretty"
Then restart TA. A new blank config index will get created. You'll have to enter your config values again from the settings page.
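If you keep container logs in a file, the conflict described above can be detected by searching for the two quoted fragments. This is a rough sketch: the helper name is invented, and the exact layout of the real log line may differ from the sample string used in the example, which is stitched together from the fragments quoted above.

```python
def needs_appsettings_reset(log_text):
    """Return True if the log contains the appsettings parsing error."""
    return (
        "document_parsing_exception" in log_text
        and "in document with id 'appsettings'" in log_text
    )


if __name__ == "__main__":
    # Sample text assembled from the message fragments, not a real log line.
    sample = (
        "document_parsing_exception: failed to parse field [...] "
        "of type [...] in document with id 'appsettings'"
    )
    print(needs_appsettings_reset(sample))
```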
Added
- Added additional sleep statements, by @Styloy
- Added PO Token for yt-dlp
- Added user config toggle to show/hide help text
Changed
- Backend is now served with uvicorn, a slim and convenient ASGI-capable web server.
- Redis connection is now configured with the `REDIS_CON` environment variable for better flexibility.
- Sleep interval is now automatically randomized to +/-50% from the value set.
- There are additional sleep statements set to avoid hitting rate limits.
- The app settings page got a rewrite; config fields are now handled individually and not in a form.
- Similar to the channel config overwrites.
- This no longer depends on the RedisJSON part of `redis-stack-server` but on just default `redis`, simplifying things and making updates upstream less error prone.
- All incoming and outgoing API data and parameters are now serialized and validated.
Fixed
- Fixed live URL parsing, by @FunkeCoder23, #805
- Fixed failing channel metadata extraction with faulty fallback implementation, #795
Dev setup
- The application can now easily be run outside of the container for development. See CONTRIBUTING.md for more details.
- Linting is now done with `pre-commit` for more reproducible results across various systems and CI/CD.
Docs
- All environment variables are now documented on a dedicated page. As these apply to all installation instructions, we can avoid duplication.
- The API docs are now generated with Swagger, they are accessible on your TA instance directly at `/api/docs/`.
  - Adding the Swagger docs publicly on the docs site is pending...
API Changes
Only applicable if you made any API integrations. This is a list of changes to API endpoints.
On a general note:
- All data, queries and return statements are now serialized
- The swagger docs are accessible directly on your TA instance
- The return format of some endpoints has changed:
- List views with pagination return a "data" top level key with a list of objects. They also return a "paginate" top level key with the pagination object.
- List views that do not implement pagination, return the content directly in an array without a top level "data" key.
- Detail views return the object directly without a "data" key.
- List views no longer also return the appconfig object, use the dedicated endpoint instead.
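For API integrations, the two list-view shapes described above can be normalized in one place. The following is a minimal sketch with a hypothetical helper name. It is meant for list views only, and the fields inside the pagination object are not described here, so the values in the example are made up.

```python
def extract_items(payload):
    """Return (items, pagination) from a list-view response body."""
    # Paginated list view: {"data": [...], "paginate": {...}}
    if isinstance(payload, dict) and "data" in payload:
        return payload["data"], payload.get("paginate")
    # Non-paginated list view: the array is returned directly.
    if isinstance(payload, list):
        return payload, None
    raise TypeError("payload does not look like a list-view response")


if __name__ == "__main__":
    # Example payloads with invented content.
    paginated = {"data": [{"id": 1}], "paginate": {"example_field": 1}}
    plain = [{"id": 1}, {"id": 2}]
    print(extract_items(paginated))
    print(extract_items(plain))
```

Detail views return the object directly, so they need no unwrapping and should not be passed through this helper.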
Video endpoints:
| from | to | comment |
|---------------------------------|---------------------------------|---------------------------------------|
| /api/video/
Task endpoints:
| from | to | comment |
|------------------------------|---------------------------------|------------------------------|
| /api/task-name/ | /api/task/by-name/ | get all task results |
| /api/task-name/\
Settings endpoints:
| from | to | comment |
|-------------------------------|-------------------------------------------|--------------------------|
| /api/snapshot/ | /api/appsettings/snapshot/ | get all ES snapshots |
| /api/snapshot/\
User endpoints:
| from | to | comment |
|------------------------|--------------------|----------------------|
| /api/config/user/ | /api/user/me/ | current user details |
| /api/login/ | /api/user/login/ | login user |
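When updating an existing integration, the fixed paths from the tables above can be translated with a simple lookup. This sketch uses a hypothetical helper name and only covers exact matches on rows that are listed in full, so paths with IDs or parameters are returned unchanged.

```python
# Old path -> new path, taken from the complete rows in the tables above.
RENAMED_ENDPOINTS = {
    "/api/task-name/": "/api/task/by-name/",
    "/api/snapshot/": "/api/appsettings/snapshot/",
    "/api/config/user/": "/api/user/me/",
    "/api/login/": "/api/user/login/",
}


def migrate_path(path):
    """Return the new endpoint path, or the input if no rename is known."""
    return RENAMED_ENDPOINTS.get(path, path)


if __name__ == "__main__":
    for old in ("/api/login/", "/api/config/user/", "/api/ping/"):
        print(f"{old} -> {migrate_path(old)}")
```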
Converted to parameters:
| from | to | comment |
|--------------------------------------|--------------------------------------|-----------------|
| /api/playlist/\