What specific syntax must be changed in the cloud-init startup script excerpt below so that the error shown below is handled by retrying, or by falling back to an alternative source, until the command succeeds?

Command that triggers the error:

The command in our startup script that seems to be triggering the error is:

dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm

Error Message:

The error message seems to be:

azure-arm: Errors during downloading metadata for repository 'epel':
azure-arm:   - Status code: 503 for https://mirrors.fedoraproject.org/metalink?repo=epel-8&arch=x86_64&infra=$infra&content=$contentdir (IP: 123.45.678.901)
azure-arm:   - Status code: 503 for https://mirrors.fedoraproject.org/metalink?repo=epel-8&arch=x86_64&infra=$infra&content=$contentdir (IP: 123.45.678.908)
azure-arm:   - Status code: 503 for https://mirrors.fedoraproject.org/metalink?repo=epel-8&arch=x86_64&infra=$infra&content=$contentdir (IP: 98.765.43.21)
azure-arm: Error: Failed to download metadata for repo 'epel': Cannot prepare internal mirrorlist: Status code: 503 for https://mirrors.fedoraproject.org/metalink?repo=epel-8&arch=x86_64&infra=$infra&content=$contentdir (IP: 86.753.09.11)

The Context

A RHEL 7 image is being built in Azure by Packer using a cloud-init startup script. Normally, the build works correctly. However, right now, the build is failing when the line given above throws the error given above due to some dependency problem.

How would we need to re-write the lines around the line that is breaking in order for the install to complete without error?

Our requirement is to do the dnf install directly from a specific rpm file as above, but what do we change to keep the process from failing on the rare occasions when the URL given for the rpm is not responding correctly?

The automation that includes the build takes a long time to run before it gets to the point where this error is thrown.

Handling this error would thus eliminate a lot of wasted time by preventing the scenario of needing to re-run long automation processes.

Results of @Haxiel's suggested code:

We tried the code suggested by @Haxiel in an answer posted below, but we got the following error as a result.

What specific syntax must be changed to resolve this error and solve the original problem posted in this OP?

azure-arm: + for repourl in "https://fedora.cu.be/epel" "https://lon.mirror.rackspace.com/epel" "https://ftp.yz.yamagata-u.ac.jp/pub/linux/fedora-projects/epel"
azure-arm: + curl --silent --fail --max-time 5 https://fedora.cu.be/epel
azure-arm: + echo 'Repository reachable.'
azure-arm: Repository reachable.
azure-arm: + for repourl in "https://fedora.cu.be/epel" "https://lon.mirror.rackspace.com/epel" "https://ftp.yz.yamagata-u.ac.jp/pub/linux/fedora-projects/epel"
azure-arm: + curl --silent --fail --max-time 5 https://lon.mirror.rackspace.com/epel
azure-arm: + echo 'Repository reachable.'
azure-arm: Repository reachable.
azure-arm: + for repourl in "https://fedora.cu.be/epel" "https://lon.mirror.rackspace.com/epel" "https://ftp.yz.yamagata-u.ac.jp/pub/linux/fedora-projects/epel"
azure-arm: + curl --silent --fail --max-time 5 https://ftp.yz.yamagata-u.ac.jp/pub/linux/fedora-projects/epel
azure-arm: + echo 'Repository reachable.'
azure-arm: Repository reachable.
azure-arm: + sudo dnf --cacheonly -y install https://ftp.yz.yamagata-u.ac.jp/pub/linux/fedora-projects/epel/epel-release-latest-8.noarch.rpm
azure-arm: Last metadata expiration check: 0:11:50 ago on Mon 07 Feb 2022 05:36:46 PM UTC.
azure-arm: epel-release-latest-8.noarch.rpm                 37 kB/s |  23 kB     00:00
azure-arm: Dependencies resolved.
azure-arm: ================================================================================
azure-arm:  Package             Architecture  Version            Repository           Size
azure-arm: ================================================================================
azure-arm: Installing:
azure-arm:  epel-release        noarch        8-13.el8           @commandline         23 k
azure-arm:
azure-arm: Transaction Summary
azure-arm: ================================================================================
azure-arm: Install  1 Package
azure-arm:
azure-arm: Total size: 23 k
azure-arm: Installed size: 35 k
azure-arm: Downloading Packages:
azure-arm: Running transaction check
azure-arm: Transaction check succeeded.
azure-arm: Running transaction test
azure-arm: Transaction test succeeded.
azure-arm: Running transaction
azure-arm:   Preparing        :                                                        1/1
azure-arm:   Installing       : epel-release-8-13.el8.noarch                           1/1
azure-arm:   Running scriptlet: epel-release-8-13.el8.noarch                           1/1
azure-arm:   Verifying        : epel-release-8-13.el8.noarch                           1/1
azure-arm: Installed products updated.
azure-arm:
azure-arm: Installed:
azure-arm:   epel-release-8-13.el8.noarch
azure-arm:
azure-arm: Complete!
azure-arm: + sed -i '/^metalink.*/d' /etc/yum.repos.d/epel-modular.repo /etc/yum.repos.d/epel-playground.repo /etc/yum.repos.d/epel.repo /etc/yum.repos.d/epel-testing-modular.repo /etc/yum.repos.d/epel-testing.repo
azure-arm: + sed -i 's|^#baseurl.*|baseurl=https://ftp.yz.yamagata-u.ac.jp/pub/linux/fedora-projects/epel|g' /etc/yum.repos.d/epel-modular.repo /etc/yum.repos.d/epel-playground.repo /etc/yum.repos.d/epel.repo /etc/yum.repos.d/epel-testing-modular.repo /etc/yum.repos.d/epel-testing.repo
azure-arm: + dnf install -y telnet
azure-arm: Extra Packages for Enterprise Linux 8 - x86_64  205  B/s | 196  B     00:00
azure-arm: Errors during downloading metadata for repository 'epel':
azure-arm:   - Status code: 404 for https://ftp.yz.yamagata-u.ac.jp/pub/linux/fedora-projects/epel/repodata/repomd.xml (IP: 123.45.678.90)
azure-arm: Error: Failed to download metadata for repo 'epel': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
CodeMed
  • HTTP 503 indicates a server-side error, and there's not much a client can do about it except fail gracefully. However, the issue seems to be occurring when fetching the list of mirrors. You could skip this step and point the repo configuration to a specific mirror. Is this feasible for you? – Haxiel Feb 05 '22 at 03:04
  • @Haxiel Are you able to suggest where we would get the list of alternative mirrors and what the several lines of `cloud-init` bash code would need to look like in order to query the other mirrors if and only if the first one throws an error as above? I imagine this might be 5 to 10 lines of bash, but do not know how to construct those 5 to 10 lines. – CodeMed Feb 05 '22 at 03:09

1 Answer

This is a proof-of-concept that I have come up with. You can use this as a starting point.

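# Candidate EPEL mirrors; the loop stops at the first one that responds.
# If none responds, $repourl is left at the last entry in the list.
for repourl in \
    "https://fedora.cu.be/epel" \
    "https://lon.mirror.rackspace.com/epel" \
    "https://ftp.yz.yamagata-u.ac.jp/pub/linux/fedora-projects/epel"
do
    if curl --silent --fail --max-time 5 "$repourl" &> /dev/null
    then
        echo "Repository reachable."
        break
    fi
done

# Install the EPEL release package directly from the selected mirror.
sudo dnf --cacheonly install "$repourl/epel-release-latest-8.noarch.rpm"

# Remove the metalink entries, then uncomment baseurl and point it at the mirror.
sed -i '/^metalink.*/d' /etc/yum.repos.d/epel*
sed -i "s|^#baseurl=https://download.example/pub/epel|baseurl=$repourl|g" /etc/yum.repos.d/epel*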

We start with a list of EPEL repository mirror URLs. The available mirrors are listed at https://admin.fedoraproject.org/mirrormanager/. You can pick two or three mirrors which are geographically closest to you. I have picked three random mirrors here.

In a loop, we do an HTTP GET with curl to see if the repository server responds. The --fail option causes curl to exit with a non-zero status when the server returns an HTTP error (a status code of 400 or above, which includes the server-side 5XX errors). The --max-time 5 option allows the operation to last as long as 5 seconds, after which curl gives up.
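The same check could be wrapped in a small function that also reports when no mirror answers at all. This is only a sketch, assuming bash and curl: `probe_url` and `first_reachable` are hypothetical helper names, and the `PROBE` variable is an illustrative override so that the selection logic can be tried without network access.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if the URL answers within 5 seconds
# without an HTTP error status (400 or above).
probe_url() {
    curl --silent --fail --max-time 5 --output /dev/null "$1"
}

# Hypothetical helper: print the first URL for which the probe succeeds.
# Returns non-zero if no URL passes, so the caller can abort early.
first_reachable() {
    local probe="${PROBE:-probe_url}" url
    for url in "$@"; do
        if "$probe" "$url"; then
            printf '%s\n' "$url"
            return 0
        fi
    done
    return 1
}
```

A caller might then write `repourl=$(first_reachable "https://fedora.cu.be/epel" "https://lon.mirror.rackspace.com/epel") || exit 1`, which stops the script instead of continuing with an unreachable mirror.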

Once we have a reachable repository (at least one of the three), we break the loop and install the epel-release-latest-8.noarch.rpm package from that repo. The --cacheonly option helps to avoid dependencies on any existing repositories.

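Once the EPEL package is installed, we need to fix the URLs to point to the specific repository mirror. We do this by using a couple of sed commands. The first one gets rid of the 'metalink' property that points to https://mirrors.fedoraproject.org. The second uncomments the 'baseurl' property and replaces only its placeholder prefix (https://download.example/pub/epel) with the mirror that we have just verified to be working. The rest of each baseurl line is kept, because each repository file has a slightly different path after that prefix.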
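To see what the two sed commands do, they can be run against a scratch file first. The sketch below assumes GNU sed (for `-i`); `pin_epel_mirror` is a hypothetical helper name, and the sample file content is an illustrative assumption modeled on an EPEL 8 repo file, not a verbatim copy of one. Replacing the whole baseurl line with the bare mirror URL would drop the per-repo path, which matches the 404 for `.../epel/repodata/repomd.xml` shown in the question.

```shell
#!/usr/bin/env bash

# Hypothetical helper: pin the given repo files to a single mirror.
pin_epel_mirror() {
    local mirror="$1"
    shift
    # Drop the metalink lines so dnf stops asking mirrors.fedoraproject.org.
    sed -i '/^metalink.*/d' "$@"
    # Uncomment baseurl and swap only the placeholder prefix; the rest of
    # the line (the per-repo path) is left untouched.
    sed -i "s|^#baseurl=https://download.example/pub/epel|baseurl=$mirror|" "$@"
}

# Demonstration on a scratch file (sample content is an assumption).
demo_repo="$(mktemp)"
cat > "$demo_repo" <<'EOF'
[epel]
name=Extra Packages for Enterprise Linux 8 - $basearch
#baseurl=https://download.example/pub/epel/$releasever/Everything/$basearch
metalink=https://mirrors.fedoraproject.org/metalink?repo=epel-$releasever&arch=$basearch
enabled=1
EOF

pin_epel_mirror "https://fedora.cu.be/epel" "$demo_repo"
cat "$demo_repo"
```

After the call, the scratch file has no metalink line, and its baseurl line starts with the chosen mirror while still ending in the original `$releasever/Everything/$basearch` path.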

Once all of this is done, the EPEL repo is ready for use. Further dnf commands should work without issues.
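Pinning a mirror avoids the metalink lookup, but a transient failure on the chosen mirror could still abort a long build. Since the question asks about retrying, one option is to wrap later commands in a small retry loop. This is a sketch under stated assumptions: `retry` is a hypothetical helper, not a dnf or cloud-init feature, and the attempt count and delay are arbitrary values to tune.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between failed attempts. Returns 0 as soon as the command succeeds and
# 1 once the attempts are exhausted.
retry() {
    local attempts="$1" delay="$2" n=1
    shift 2
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "Giving up after $n attempts: $*" >&2
            return 1
        fi
        echo "Attempt $n failed; retrying in ${delay}s..." >&2
        sleep "$delay"
        n=$((n + 1))
    done
}
```

For example, `retry 5 30 dnf install -y telnet` would try the install up to five times with a 30 second pause between attempts, and only then fail the script.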

Haxiel
  • The results of your suggestion were just added to the end of the OP. How do you suggest resolving the error produced by your code? Note that we had to add `-y` to the `dnf install` because this is run in a `cloud-init` startup script. – CodeMed Feb 07 '22 at 18:03
  • Will you be revising your answer based on the updated results posted at the end of the OP? – CodeMed Feb 08 '22 at 01:34
  • @CodeMed I made a change to the second `sed` command; this should be fixed now. The problem was each repository having a slightly different URL. Also, I don't see the `if` condition in your execution output. The selection should have stopped at the "https://fedora.cu.be/epel" repo, but it continued. Looks like your code is just falling through the list of repositories and selecting the last one. – Haxiel Feb 08 '22 at 04:30
  • Also, I can see that you want to install the `telnet` package, but that should already be available in the default RHEL repos (can't pin down which one) - you won't need the EPEL repo for it. This is just me trying to make sure that you have a dependency on the EPEL repo, since you're putting in a lot of work in setting it up. – Haxiel Feb 08 '22 at 04:32