Videos

Download videos from various platforms.

See also

  • A collection of scripts I have written and/or adapted that I currently use on my systems as automated tasks [1]

  • youtube-dl - ArchWiki [2]

  • GitHub - yt-dlp/yt-dlp: A youtube-dl fork with additional features and fixes [3]

  1. install the dependencies

    apt-get install python3-yaml aria2 yt-dlp/bullseye-backports
    

    Note

    If you use the provided options file as-is, you need to install aria2.

    Note

    You need to enable the Debian backports repository before installing yt-dlp.

  2. install fpyutils. See reference

  3. add the user to the jobs group

    usermod -aG jobs myuser
    
  4. create the jobs directories. See reference

    mkdir -p /home/jobs/{scripts,services}/by-user/myuser
    chmod 700 /home/jobs/{scripts,services}/by-user/myuser
    
  5. create the script

    /home/jobs/scripts/by-user/myuser/youtube_dl.py
    #!/usr/bin/env python3
    #
    # youtube_dl.py
    #
    # Copyright (C) 2019-2022 Franco Masotti (franco \D\o\T masotti {-A-T-} tutanota \D\o\T com)
    #
    # This program is free software: you can redistribute it and/or modify
    # it under the terms of the GNU General Public License as published by
    # the Free Software Foundation, either version 3 of the License, or
    # (at your option) any later version.
    #
    # This program is distributed in the hope that it will be useful,
    # but WITHOUT ANY WARRANTY; without even the implied warranty of
    # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    # GNU General Public License for more details.
    #
    # You should have received a copy of the GNU General Public License
    # along with this program.  If not, see <https://www.gnu.org/licenses/>.
    r"""Download videos, delete old ones and get a notification."""
    
    import datetime
    import pathlib
    import shlex
    import sys
    
    import fpyutils
    import yaml
    
    if __name__ == '__main__':
        # Paths go straight to Python APIs: shlex.quote is only meant for
        # strings interpolated into a shell command line.
        configuration_file = sys.argv[1]
        with open(configuration_file) as f:
            config = yaml.load(f, Loader=yaml.SafeLoader)
    
        pathlib.Path(config['youtube_dl']['dst_dir']).mkdir(
            mode=0o700, parents=True, exist_ok=True)
        pathlib.Path(config['youtube_dl']['archived_list_file']).touch(
            mode=0o700, exist_ok=True)
        with open(config['youtube_dl']['archived_list_file']) as f:
            original_files = sum(1 for line in f)
    
        # Build binary.
        cmd = ''
        if 'binary' in config['youtube_dl']:
            for c in config['youtube_dl']['binary']:
                cmd += shlex.quote(c) + ' '
    
        # youtube-dl might not return 0 even if some videos are correctly downloaded.
        command = 'pushd ' + shlex.quote(
            config['youtube_dl']['dst_dir']
        ) + ' ; ' + cmd + ' --verbose --config-location ' + shlex.quote(
            config['youtube_dl']['options_file']) + ' --batch ' + shlex.quote(
                config['youtube_dl']
                ['url_list_file']) + ' --download-archive ' + shlex.quote(
                    config['youtube_dl']['archived_list_file']) + ' ; popd'
        fpyutils.shell.execute_command_live_output(command)
    
        # For this to work be sure to set the no-mtime option in the options file.
        # Only the video files, in fact, would retain the original modification
        # time (not the modification time corresponding to the download time).
        # All the other files such as thumbnails and subtitles do not retain the
        # original mtime. For this reason it is simpler not to consider the
        # original mtime.
        deleted_files = 0
        if config['delete']['enabled']:
            # The glob depth must match the --output template in the options
            # file: uploader/year/month/file is four levels below dst_dir.
            for f in sorted(
                    pathlib.Path(
                        config['youtube_dl']['dst_dir']).glob('*/*/*/*')):
                if f.is_file() and (datetime.date.today() -
                                    datetime.date.fromtimestamp(f.stat().st_mtime)
                                    ).days > config['delete']['days_to_keep']:
                    f.unlink()
                    deleted_files += 1
    
        with open(config['youtube_dl']['archived_list_file']) as f:
            final_files = sum(1 for line in f)
    
        downloaded_files = final_files - original_files
    
        message = 'DW: ' + str(downloaded_files) + ' ; RM: ' + str(deleted_files)
        if config['notify']['gotify']['enabled']:
            m = config['notify']['gotify']['message'] + '\n' + message
            fpyutils.notify.send_gotify_message(
                config['notify']['gotify']['url'],
                config['notify']['gotify']['token'], m,
                config['notify']['gotify']['title'],
                config['notify']['gotify']['priority'])
        if config['notify']['email']['enabled']:
            fpyutils.notify.send_email(message,
                                       config['notify']['email']['smtp_server'],
                                       config['notify']['email']['port'],
                                       config['notify']['email']['sender'],
                                       config['notify']['email']['user'],
                                       config['notify']['email']['password'],
                                       config['notify']['email']['receiver'],
                                       config['notify']['email']['subject'])
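
The deletion rule in the script above can be isolated as a small function. What follows is a minimal sketch and not part of the original script: the helper name `is_expired` and its parameters are hypothetical, and the `today` argument exists only so the check can be exercised without depending on the current date.

```python
import datetime
import os
import pathlib
import tempfile


def is_expired(path: pathlib.Path, days_to_keep: int,
               today: datetime.date) -> bool:
    """Return True if the file's mtime is more than days_to_keep days old."""
    # Same comparison as the script: whole days between today and the
    # date of the file's last modification.
    mtime_date = datetime.date.fromtimestamp(path.stat().st_mtime)
    return (today - mtime_date).days > days_to_keep


with tempfile.TemporaryDirectory() as tmp:
    video = pathlib.Path(tmp) / 'example.mkv'
    video.touch()
    # Pretend the file was written 61 days ago.
    old = datetime.datetime.now() - datetime.timedelta(days=61)
    os.utime(video, (old.timestamp(), old.timestamp()))
    print(is_expired(video, 60, datetime.date.today()))
```

Because the comparison is strict, a file that is exactly `days_to_keep` days old is kept for one more day.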
    
  6. create a configuration file

    /home/jobs/scripts/by-user/myuser/youtube_dl.some_subject.yaml
    #
    # youtube_dl.some_subject.yaml
    #
    # Copyright (C) 2019-2022 Franco Masotti (franco \D\o\T masotti {-A-T-} tutanota \D\o\T com)
    #
    # This program is free software: you can redistribute it and/or modify
    # it under the terms of the GNU General Public License as published by
    # the Free Software Foundation, either version 3 of the License, or
    # (at your option) any later version.
    #
    # This program is distributed in the hope that it will be useful,
    # but WITHOUT ANY WARRANTY; without even the implied warranty of
    # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    # GNU General Public License for more details.
    #
    # You should have received a copy of the GNU General Public License
    # along with this program.  If not, see <https://www.gnu.org/licenses/>.
    
    youtube_dl:
        # A list of strings corresponding to binaries and options.
        # You need to specify at least the youtube-dl binary.
        binary:
            # - torsocks
            # - --isolate
            - /home/myuser/.local/bin/yt-dlp
    
        # The configuration file for youtube-dl.
        options_file: '/home/jobs/scripts/by-user/myuser/youtube_dl.some_subject.options'
    
        # The base directory where all the videos will lie.
        dst_dir: '/home/myuser/videos/bot/some_subject'
    
        # The file containing the list of URLs to download.
        url_list_file: '/home/jobs/scripts/by-user/myuser/youtube_dl.some_subject.txt'
    
        # The file containing the list of downloaded urls.
        archived_list_file: '/home/myuser/videos/bot/some_subject/archived.txt'
    
    delete:
        enabled: true
        days_to_keep: 60
    
    notify:
        email:
            enabled: false
            smtp_server: 'smtp.gmail.com'
            port: 465
            sender: 'myusername@gmail.com'
            user: 'myusername'
            password: 'my awesome password'
            receiver: 'myusername@gmail.com'
            subject: 'video bot: some_subject'
        gotify:
            enabled: false
            url: '<gotify url>'
            token: '<app token>'
            title: 'video bot'
            message: 'some_subject'
            priority: 5
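
Because the script indexes the configuration directly, a missing key only surfaces as a `KeyError` partway through a run. A minimal sketch of an up-front check follows. The `REQUIRED_KEYS` table and the `missing_keys` helper are hypothetical and not part of the original script. The sketch works on the dictionary that `yaml.load` would return, so it needs only the standard library.

```python
# Keys the script reads unconditionally, grouped by top-level section.
# (Assumption: this table was derived by reading the script above.)
REQUIRED_KEYS = {
    'youtube_dl': ['options_file', 'dst_dir', 'url_list_file',
                   'archived_list_file'],
    'delete': ['enabled'],
    'notify': ['email', 'gotify'],
}


def missing_keys(config: dict) -> list:
    """Return the dotted names of required keys absent from config."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        if not isinstance(config.get(section), dict):
            # The whole section is absent or is not a mapping.
            missing.append(section)
            continue
        for key in keys:
            if key not in config[section]:
                missing.append(section + '.' + key)
    return missing


example = {
    'youtube_dl': {'options_file': 'a.options', 'dst_dir': '/tmp/videos'},
    'delete': {'enabled': False},
    'notify': {'email': {'enabled': False}, 'gotify': {'enabled': False}},
}
print(missing_keys(example))
```

An empty list means every key the script reads unconditionally is present; `binary` is left out of the table because the script treats it as optional.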
    
  7. create the options file, which sets all the options passed to yt-dlp

    /home/jobs/scripts/by-user/myuser/youtube_dl.some_subject.options
    #
    # youtube_dl.some_subject.options
    #
    # Copyright (C) 2019 Franco Masotti (franco \D\o\T masotti {-A-T-} tutanota \D\o\T com)
    #
    # This program is free software: you can redistribute it and/or modify
    # it under the terms of the GNU General Public License as published by
    # the Free Software Foundation, either version 3 of the License, or
    # (at your option) any later version.
    #
    # This program is distributed in the hope that it will be useful,
    # but WITHOUT ANY WARRANTY; without even the implied warranty of
    # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    # GNU General Public License for more details.
    #
    # You should have received a copy of the GNU General Public License
    # along with this program.  If not, see <https://www.gnu.org/licenses/>.
    
    # --get-id
    
    # Filter by video release date or by playlist id number,
    # up to playlist-end videos back.
    #--dateafter now-1days
    --playlist-start 1
    --playlist-end 5
    
    --add-metadata
    
    # Download data in parallel. Change or remove the max-overall-download-limit option.
    --external-downloader aria2c
    --external-downloader-args '-c -j 3 -x 3 -s 3 -k 1M --max-overall-download-limit=512K'
    
    # Avoid geographical restrictions.
    --geo-bypass
    
    --force-ipv4
    --no-color
    --ignore-errors
    --continue
    --no-cache-dir
    
    --socket-timeout 300
    
    # Prefer 720p videos and fall back to 480p.
    --format 247+bestaudio/136+bestaudio/244+bestaudio/135+bestaudio
    
    # Subtitles.
    --write-sub
    --write-auto-sub
    
    # Output file should be a mkv container.
    --merge-output-format mkv
    
    # Get the video thumbnail.
    --write-thumbnail
    
    # File path.
    # Subdirectories by yyyy-mm.
    --output "%(uploader)s/%(upload_date>%Y)s/%(upload_date>%m)s/%(upload_date)s_%(playlist_index)s_%(title)s_%(id)s"
    
    --write-description
    --write-info-json
    #--write-annotations
    --convert-subs vtt
    
    # Save the list of the previously downloaded videos.
    --download-archive archive.txt
    
    # Slugify file names.
    --restrict-filenames
    
    --no-overwrite
    --no-call-home
    --prefer-free-formats
    --fixup detect_or_warn
    --prefer-ffmpeg
    
    # Transcode to HEVC, 8-bit color.
    #--postprocessor-args "-c:v libx265 -preset veryfast -x265-params crf=28 -pix_fmt yuv420p -c:a copy"
    
    # Sleep a number of seconds between one download and the other.
    --sleep-interval 10
    
    # Use the following url list.
    --batch channels.txt
    
    # Very important if you enable automatic file deletion.
    --no-mtime
    
    # Proxy. Enable and edit this option as needed.
    # --proxy 127.0.0.1:8123
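
The `--output` template above places each file three directories below the destination directory: uploader, year, then month. The sketch below only illustrates the resulting layout. It does not call yt-dlp, and the `expected_relative_path` helper and the sample metadata values are made up for the example. It assumes `upload_date` is a `YYYYMMDD` string and ignores any padding, sanitizing or file extension that yt-dlp itself applies.

```python
import datetime
import pathlib


def expected_relative_path(uploader: str, upload_date: str,
                           playlist_index: str, title: str,
                           video_id: str) -> pathlib.PurePosixPath:
    """Mimic the layout uploader/YYYY/MM/<date>_<index>_<title>_<id>."""
    # Parsing validates the date and gives access to the year and month.
    date = datetime.datetime.strptime(upload_date, '%Y%m%d')
    name = '_'.join([upload_date, playlist_index, title, video_id])
    return pathlib.PurePosixPath(uploader, date.strftime('%Y'),
                                 date.strftime('%m'), name)


path = expected_relative_path('Some_Channel', '20220315', '1',
                              'Some_title', 'abc123')
print(path)
# Number of path components below the destination directory.
print(len(path.parts))
```

The component count matters for the cleanup step of the Python script: its glob pattern has to reach the same depth as the files produced by this template.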
    
  8. create a text file containing the URLs of channels or playlists, one per line

    /home/jobs/scripts/by-user/myuser/youtube_dl.some_subject.txt
    http://a-newline
    http://separated-list
    http://of-urls.
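
Since the URL list is plain text, it is easy to leave a malformed entry in it. A minimal sketch of a pre-flight check using only the standard library follows. The `read_url_list` helper is hypothetical, and treating lines that start with `#` as comments is an assumption about how the batch file is read.

```python
import urllib.parse


def read_url_list(text: str) -> tuple:
    """Split the file content into (valid URLs, rejected lines)."""
    valid, rejected = [], []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith('#'):
            continue
        parts = urllib.parse.urlsplit(line)
        # Accept only entries with an http(s) scheme and a host.
        if parts.scheme in ('http', 'https') and parts.netloc:
            valid.append(line)
        else:
            rejected.append(line)
    return valid, rejected


sample = 'https://example.com/channel\n\n# a comment\nnot a url\n'
print(read_url_list(sample))
```

Running a check like this before the timer fires makes a typo in the list visible immediately instead of in the next day's log.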
    
  9. create the systemd service unit file

    /home/jobs/services/by-user/myuser/youtube-dl.some_subject.service
    [Unit]
    Description=Download youtube-dl.some_subject videos
    Requires=network-online.target
    After=network-online.target
    
    [Service]
    Type=simple
    ExecStart=/home/jobs/scripts/by-user/myuser/youtube_dl.py /home/jobs/scripts/by-user/myuser/youtube_dl.some_subject.yaml
    User=myuser
    Group=myuser
    
    [Install]
    WantedBy=multi-user.target
    
  10. create the systemd timer unit file

    /home/jobs/services/by-user/myuser/youtube-dl.some_subject.timer
    [Unit]
    Description=Once a day download some_subject videos
    
    [Timer]
    OnCalendar=*-*-* 3:30:00
    Persistent=true
    
    [Install]
    WantedBy=timers.target
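
`OnCalendar=*-*-* 3:30:00` fires every day at 03:30 local time. As a rough illustration of that schedule, the hypothetical helper below computes the next trigger after a given moment. It does not reproduce systemd's calendar parser, and the catch-up runs enabled by `Persistent=true` are not modelled.

```python
import datetime


def next_daily_run(now: datetime.datetime,
                   at: datetime.time = datetime.time(3, 30)
                   ) -> datetime.datetime:
    """Return the first occurrence of the daily time strictly after now."""
    candidate = datetime.datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed: use tomorrow's.
        candidate += datetime.timedelta(days=1)
    return candidate


print(next_daily_run(datetime.datetime(2022, 1, 1, 2, 0)))
print(next_daily_run(datetime.datetime(2022, 1, 1, 4, 0)))
```

The first call lands on the same day at 03:30, the second on the following day.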
    
  11. fix the permissions

    chown -R myuser:myuser /home/jobs/{scripts,services}/by-user/myuser
    chmod 700 -R /home/jobs/{scripts,services}/by-user/myuser
    
  12. run the deploy script

Footnotes