Channel: openSUSE Forums

Parallel download of rpm packages

Zypper downloads packages serially, and with about a thousand packages to update each week, that gets quite tedious. While zypper can download packages one by one in advance, it can't be run concurrently. I found the libzypp-bindings project, but it has been discontinued, so I set out to improve the situation myself.

Goals:


  • Download all repositories in parallel (most often different servers);
  • Download up to MAX_PROC (=6) packages from each repository in parallel;
  • Save packages where zypper picks them up during system update: /var/cache/zypp/packages;
  • Alternatively, download to $HOME/.cache/zypp/packages;
  • Avoid external dependencies, unless necessary.


Outline:


  1. Find the list of packages to update;
  2. Find the list of repositories;
  3. For each repository:
    1. Keep up to $MAX_PROC curl processes downloading packages.

  4. Copy files to default package cache.
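As a side note on step 3: recent curl releases can do the per-repository fan-out by themselves. A sketch, assuming curl >= 7.66 (which added --parallel); the base URL and package names are hypothetical placeholders:

```shell
# Assumption: curl >= 7.66, which supports --parallel / --parallel-max.
# The repository URL and package names below are made up for illustration.
base="https://download.example.org/repo/x86_64"
pkgs=(foo-1.0-1.x86_64.rpm bar-2.3-4.x86_64.rpm)

urls=()
for p in "${pkgs[@]}"; do
    urls+=("$base/$p")
done

# curl fetches all URLs concurrently, at most 6 at a time, saving each
# under its remote file name:
cmd=(curl --silent --fail -L --parallel --parallel-max 6 --remote-name-all "${urls[@]}")
printf '%s\n' "${cmd[*]}"
```

This would trade the manual wait -n bookkeeping for a dependency on a newer curl.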


Results & Open Issues:


  1. Great throughput: 2,152 kb/s vs 783 kb/s;
  2. zypper list-updates doesn't list newly required/recommended packages, so my routine never puts those in the cache. Any tip on how to fetch them as well?
  3. I'm not sure whether wait -n returns to the same background function that dispatched a download request, whether any single waiter can capture the exit, or whether all waiters resume from one process exit. This may lead to an unbalanced MAX_PROC per repository, especially if a single, different process captures the wait. Does anyone know which primitive wait is modeled after (mutex, semaphore, etc.) or how it behaves when more than one wait is pending?
  4. Also, I'm not sure whether I should use the local cache or the system cache, although that's a minor issue.
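On open issue 3, a small experiment is instructive: wait -n (bash 4.3+) behaves like a counting-semaphore decrement on the current shell's own job table, returning as soon as any one child of that shell exits; it does not hand control back to a particular dispatcher. Since each download_repo runs as its own background subshell with its own job table, the MAX_PROC budget should stay per-repository. A minimal sketch:

```shell
#!/bin/bash
# 'wait -n' blocks until ANY one child of the current shell exits and
# returns that child's status. Each call consumes exactly one child exit.
sleep 2 &
slow=$!
sleep 0.1 &

wait -n                        # returns after ~0.1s, when the fast job exits
remaining=$(jobs -rp | wc -l)  # the slow job (sleep 2) is still running
echo "jobs still running after first wait -n: $remaining"

wait "$slow"                   # wait for one specific PID when needed
echo "all jobs finished"
```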


Code:

#!/bin/bash

MAX_PROC=6

function repos_to_update () {
    zypper list-updates | grep '^v ' | awk -F '|' '{ print $2 }' | sort --unique | tr -d ' '
}

function packages_from_repo () {
    local repo=$1

    # zypper list-updates columns: S|Repository|Name|Current Version|Available Version|Arch
    zypper list-updates | grep " | $repo " | awk -F '|' '{ print $6, "#", $3, "-", $5, ".", $6, ".rpm" }' | tr -d ' '
}

function repo_uri () {
    local repo=$1

    zypper repos --uri | grep " | $repo " | awk -F '|' '{ print $7 }' | tr -d ' '
}

function repo_alias () {
    local repo=$1

    zypper repos | grep " | $repo " | awk -F '|' '{ print $2 }' | tr -d ' '
}

function download_package () {
    local alias=$1
    local uri=$2
    local line=$3
    local arch package_name
    IFS='#' read -r arch package_name <<< "$line"

    local package_uri="$uri/$arch/$package_name"
    local local_dir="$HOME/.cache/zypp/packages/$alias/$arch"
    local local_path="$local_dir/$package_name"
    printf -v y '%-30s' "$alias"   # $repo is not set in this function; use the alias
    printf 'Repository: %s Package: %s\n' "$y" "$package_name"
    if [ ! -f "$local_path" ]; then
        mkdir -p "$local_dir"
        curl --silent --fail -L -o "$local_path" "$package_uri"
    fi
}

function download_repo () {
    local repo=$1

    local uri=$(repo_uri "$repo")
    local alias=$(repo_alias "$repo")
    local pkgs=$(packages_from_repo "$repo")
    local max_proc=$MAX_PROC
    while IFS= read -r line; do
        if [ $max_proc -eq 0 ]; then
            wait -n
            ((max_proc++))
        fi
        download_package "$alias" "$uri" "$line" &
        ((max_proc--))
    done <<< "$pkgs"
    wait  # don't return until this repository's remaining downloads finish
}

function download_all () {
    local repos=$(repos_to_update)
    while IFS= read -r line; do
        download_repo "$line" &
    done <<< "$repos"
    wait
}

download_all
#sudo cp -r ~/.cache/zypp/packages/* /var/cache/zypp/packages/
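For open issue 2, one possible (if slower) workaround: zypper itself can let the solver compute the complete transaction, including newly required/recommended packages, and prefetch everything serially into /var/cache/zypp/packages without installing. A sketch, guarded so it is a no-op on systems without zypper:

```shell
# Assumption: zypper supports --download-only for 'update' (it does on
# current openSUSE). Needs root; packages land in /var/cache/zypp/packages.
if command -v zypper >/dev/null; then
    sudo zypper --non-interactive update --download-only
else
    echo "zypper not available on this system"
fi
```

This defeats the parallelism, but could be run once to pick up the dependency additions the script above misses.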

