Zypper downloads packages serially, and with roughly a thousand packages to update each week, that gets quite tedious. While zypper can download packages one by one in advance, it can't be invoked concurrently. I found the libzypp-bindings project, but it is discontinued. So I set out to improve the situation myself.
Goals:
- Download all repositories in parallel (most often different servers);
- Download up to MAX_PROC (=6) packages from each repository in parallel;
- Save packages where zypper picks them up during system update: /var/cache/zypp/packages;
- Alternatively, download to $HOME/.cache/zypp/packages;
- Avoid external dependencies, unless necessary.
Outline:
- Find the list of packages to update;
- Find the list of repositories;
- For each repository:
- Keep up to $MAX_PROC curl processes downloading packages.
- Copy files to default package cache.
Results & Open Issues:
- Much better throughput: 2,152 kb/s vs. 783 kb/s before;
- zypper list-updates doesn't list newly required/recommended packages, so my routine never puts them in the cache. Any tip on how to fetch those as well?
- I'm not sure whether wait -n returns in the same background function that dispatched a download request, whether any one waiter can capture it, or whether all of them resume on a single process exit. This may lead to an unbalanced MAX_PROC per repository, especially if a single, different process captures the wait. Does anyone know which primitive wait is modeled after (mutex, semaphore, etc.), or how it behaves when there is more than one waiter?
- I'm also not sure whether I should use the local cache or the system cache, although that's a minor issue.
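Regarding the missing required/recommended packages: one possible workaround (an untested sketch; resolved_packages is a hypothetical helper, and the header wording is an assumption based on the dry-run summary of my zypper version, so the patterns may need adjusting) is to scrape the full dependency-resolved set from zypper update --dry-run, whose summary also lists packages pulled in as new:

```shell
#!/bin/bash
# Hypothetical helper: extract package names from a `zypper update --dry-run`
# summary. ASSUMPTION: headers look like "The following N [NEW] packages are
# going to be installed/upgraded:", followed by indented, space-separated
# names; blank lines end each section.
function resolved_packages () {
awk '
    /^The following .* going to be (installed|upgraded):/ { collect = 1; next }
    collect && /^  / { for (i = 1; i <= NF; i++) print $i; next }
    { collect = 0 }
'
}
# Usage:
#   zypper --non-interactive update --dry-run | resolved_packages
```

The resulting names would still have to be mapped back to repositories and versions before downloading, which I leave out here.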
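On the wait -n question: as far as I understand bash's documented behavior, wait -n (bash 4.3+) blocks until any single child of the calling shell terminates; there is no shared semaphore between processes. Since each download_repo is launched with & and therefore runs in its own subshell with its own job table, its wait -n can only be woken by its own curl children, so one repository's dispatcher shouldn't steal another's slot. A minimal demonstration:

```shell
#!/bin/bash
# Demo of `wait -n` (bash >= 4.3): it blocks until one child of the
# *current* shell exits. Each subshell has its own job table, so a
# `wait -n` in one subshell never observes jobs started in another.
tmp=$(mktemp -d)

( # subshell A: its `wait -n` only sees its own sleep
  sleep 0.2 &
  wait -n
  echo "A done" > "$tmp/a"
) &

( # subshell B: an independent job table
  sleep 0.1 &
  wait -n
  echo "B done" > "$tmp/b"
) &

wait # top level: waits for both subshells
out=$(cat "$tmp/a" "$tmp/b")
echo "$out" # prints "A done" then "B done"
rm -r "$tmp"
```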
Code:
#!/bin/bash
MAX_PROC=6
function repos_to_update () {
zypper list-updates | grep '^v ' | awk -F '|' '{ print $2 }' | sort --unique | tr -d ' '
}
function packages_from_repo () {
local repo=$1
zypper list-updates | grep " | $repo " | awk -F '|' '{ print $6, "#", $3, "-", $5, ".", $6, ".rpm" }' | tr -d ' '
}
function repo_uri () {
local repo=$1
zypper repos --uri | grep " | $repo " | awk -F '|' '{ print $7 }' | tr -d ' '
}
function repo_alias () {
local repo=$1
zypper repos | grep " | $repo " | awk -F '|' '{ print $2 }' | tr -d ' '
}
function download_package () {
local alias=$1
local uri=$2
local line=$3
IFS='#' read -r arch package_name <<< "$line"
local package_uri="$uri/$arch/$package_name"
local local_dir="$HOME/.cache/zypp/packages/$alias/$arch"
local local_path="$local_dir/$package_name"
printf -v padded '%-30s' "$alias"
printf 'Repository: %s Package: %s\n' "$padded" "$package_name"
if [ ! -f "$local_path" ]; then
mkdir -p "$local_dir"
curl --silent --fail -L -o "$local_path" "$package_uri"
fi
}
function download_repo () {
local repo=$1
local uri=$(repo_uri "$repo")
local alias=$(repo_alias "$repo")
local pkgs=$(packages_from_repo "$repo")
local max_proc=$MAX_PROC
while IFS= read -r line; do
if [ $max_proc -eq 0 ]; then
wait -n
((max_proc++))
fi
download_package "$alias" "$uri" "$line" &
((max_proc--))
done <<< "$pkgs"
wait # let this repository's remaining downloads finish before the subshell exits
}
function download_all () {
local repos=$(repos_to_update)
while IFS= read -r line; do
download_repo "$line" &
done <<< "$repos"
wait
}
download_all
#sudo cp -r ~/.cache/zypp/packages/* /var/cache/zypp/packages/