archive builds on zfs /tank
Hydra is serving off the ZFS tank now, and things are mostly working, including the download links. So we now have 13TB of space to archive builds :)
Two issues remain:
- Would it be possible to check the system Nix store first when Hydra serves a store entry? Right now I had to `nix copy` the whole `/tank/hydra` so that old builds, including dependencies, are still accessible, which is quite unwieldy (especially considering the slowness and excessive memory consumption of `nix copy`) and also wastes a few dozen gigabytes. This would also allow serving builds that did not go through Hydra. And, less importantly, the system Nix store is on an NVMe SSD, which is faster than the cheap spinning rust of the ZFS tank.
- This broke the functionality from the RESTRICTDIST patch.
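For reference, the serving setup described above boils down to pointing Hydra at the tank as a local binary cache. A minimal `hydra.conf` sketch — the path is from this thread, but the surrounding option set is an assumption; check the Hydra manual before relying on it:

```
# hydra.conf (sketch, details assumed): serve store contents from the ZFS tank
store_uri = file:///tank/hydra
```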
There is still some unexplained and potentially major breakage.
```
> nix-store -r /nix/store/0n0w7h88j13wdw2mimvghy0062r60ill-sinara-systems-99d3594
these paths will be fetched (0.00 MiB download, 0.12 MiB unpacked):
  /nix/store/0n0w7h88j13wdw2mimvghy0062r60ill-sinara-systems-99d3594
copying path '/nix/store/0n0w7h88j13wdw2mimvghy0062r60ill-sinara-systems-99d3594' from 'https://nixbld.m-labs.hk'...
error 10 while decompressing xz file
error: build of '/nix/store/0n0w7h88j13wdw2mimvghy0062r60ill-sinara-systems-99d3594' failed
```
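For what it's worth, error code 10 from liblzma is `LZMA_BUF_ERROR`, which usually points at a truncated stream rather than bit-flip corruption. That failure class is easy to reproduce locally; a minimal sketch (file names are illustrative, not from the server):

```shell
# A valid .xz passes the integrity test; a truncated copy fails it,
# which is the same failure class as "error 10 while decompressing xz file".
printf 'hello world' | xz > good.nar.xz
head -c 12 good.nar.xz > truncated.nar.xz   # simulate a cut-off payload

xz -t good.nar.xz && echo "good.nar.xz: ok"
xz -t truncated.nar.xz 2>/dev/null || echo "truncated.nar.xz: corrupt"
```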
Seems to affect those store entries that have been copied with `nix copy` and not with hydra-queue-runner...
This corruption still eludes me.
Could you please attach a dump of the HTTP response which contains the actual corruption for the .nar.xz in hydra-bug.tar? Maybe a diff of these will hint at what is going on.
> This corruption still eludes me.
What did you try?
Does spawning a Hydra instance with the files I posted in a `file://` store_uri fail to reproduce the problem?
> Does spawning a Hydra instance with the files I posted in a `file://` store_uri fail to reproduce the problem?
That is exactly what I did. Then I put them through Hydra/http and they decompress just fine.
I don't have a clue where this is being corrupted. There is no mangling of the .nar.xz payload. We have already tried opening the file in raw mode.
Please share a `curl -D- ...nar.xz > dump` so that I can investigate the type of corruption.
I couldn't reproduce the issue with the original path - it now works normally. But now `/nix/store/b433sdzddsafbr7mzpjycys0bxhxzby4-openocd-mlabs-0.10.0` is affected: Hydra is serving an empty nar.xz file.
Hm, it actually sends empty files on any `/nar/xxx` URL... what kind of error behavior is that?
Now there are errors such as `file 'nar/3mk294vsb4b62rg9g0f3ffna616i611a-python3.8-llvmlite-artiq-0.23.0.dev' does not exist in binary cache 'https://nixbld.m-labs.hk'` for things that actually exist.
What are you doing to cause such a path to be requested? Both filetypes (`*.narinfo` and `/nar/*.nar.xz`) have only a hash in their filename. Is this a client error?
Access patterns look like this for me (running `nix-env -i /nix/store/b433sdzddsafbr7mzpjycys0bxhxzby4-openocd-mlabs-0.10.0`, with nginx in the middle):

```
10.23.23.6 - - [23/Feb/2021:00:01:18 +0000] "GET /nix-cache-info HTTP/1.1" 200 52 "-" "curl/7.70.0 Nix/2.3.6"
10.23.23.6 - - [23/Feb/2021:00:01:18 +0000] "GET /b433sdzddsafbr7mzpjycys0bxhxzby4.narinfo HTTP/1.1" 200 816 "-" "curl/7.70.0 Nix/2.3.6"
10.23.23.6 - - [23/Feb/2021:00:01:23 +0000] "GET /nar/1j8bmi2m18rpfkpgsyf5mzg58gdi7xvbxvcn3lkkihh7lz72cvy3.nar.xz HTTP/1.1" 200 1281528 "-" "curl/7.70.0 Nix/2.3.6"
```
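As an aside, the interesting fields of such a line — request path, HTTP status, and bytes sent — can be pulled out with awk. A sketch using the last line above (field positions assume the standard combined log format):

```shell
# Extract request path, status, and response size from an access-log line.
line='10.23.23.6 - - [23/Feb/2021:00:01:23 +0000] "GET /nar/1j8bmi2m18rpfkpgsyf5mzg58gdi7xvbxvcn3lkkihh7lz72cvy3.nar.xz HTTP/1.1" 200 1281528 "-" "curl/7.70.0 Nix/2.3.6"'
echo "$line" | awk '{print $7, $9, $10}'
# prints: /nar/1j8bmi2m18rpfkpgsyf5mzg58gdi7xvbxvcn3lkkihh7lz72cvy3.nar.xz 200 1281528
```

The 1281528-byte transfer with status 200 shows the .nar.xz left nginx non-empty, which is useful when hunting for where an empty or truncated payload is introduced.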
> What are you doing to cause such a path to be requested?
`nix-shell` on a regular `shell.nix` file to install ARTIQ. The error message is as printed by Nix; I do not know if it corresponds to the actual URL.
New server arrived with root-on-ZFS.