lisper 9 hours ago [-]
> There are a shocking number of ways to accidentally create nondeterministic output when doing C/C++ development. One of the easiest is to use the builtin __DATE__ and __TIME__ macros to stamp a build with the time the compiler was executed at:
Am I missing something here? Yes, if you use a feature that intentionally inserts the build time and date into the code, then every build is going to be different. That's the whole point of these macros. It's a feature. If you don't want that behavior, don't use that feature.
yapfrog 3 hours ago [-]
"One of the easiest"
It's meant to be a trivial counterexample. Like saying "-1" to the claim "there's no number smaller than 0" to someone who's not familiar with math, the author is saying "build-dependent macros" to the claim "compilers are deterministic" to someone who might not be familiar with compilers.
lisper 44 minutes ago [-]
A trivial counterexample to what? The antecedent of "one of the easiest" was "ways to accidentally create nondeterministic output." On even the most charitable reading I can muster, that still seems to me to presume some pretty abject stupidity.
xena 9 hours ago [-]
Surprisingly, cherry-picked examples to prove a point are cherry-picked examples to prove a point.
fasterthanlime 8 hours ago [-]
I think the intentional part is that you want to print the date and time something's been compiled, and the accidental part is that you suddenly made your build non-reproducible.
But usually the realization follows the initial intent by several weeks, if not months! Your comment shines as the embodiment of hindsight is 20/20.
lisper 7 hours ago [-]
> the accidental part is that you suddenly made your build non-reproducible
But that's exactly what I don't get. How can that be considered "accidental"? How can any thinking person not realize that putting the build time into the compiled image will make every build different because, you know, different builds happen at different times? Has software engineering really been dumbed down so much that this is not immediately obvious? It feels like a mechanic doing an oil change and being surprised by having all the oil drain out if they neglect to put the drain plug back in.
Neywiny 7 hours ago [-]
With you on this especially because somebody I know asked me to help change their oil and they hadn't even considered there being waste oil to dispose of
NobodyNada 6 hours ago [-]
The engineer who wants the build to be reproducible and the engineer who wants to have the build time in the compiled binary may not be the same person.
lisper 6 hours ago [-]
Sure, but that is a completely different issue. People have mutually-conflicting goals on occasion. That is a Thing That Happens, but it is a very different phenomenon than being surprised by the obvious fact that putting a time stamp on your build makes that build non-reproducible.
eddd-ddde 6 hours ago [-]
I don't think these kinds of features belong in a compiler. If you want a stamp, then pass it in to your compiler invocation via explicit defines.
You might accidentally end up including it transitively and suddenly your binary is nondeterministic.
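A minimal sketch of the explicit-define approach, written as a Python build-script helper. The macro name `BUILD_STAMP` and the file names are hypothetical. `-Werror=date-time` relies on the `-Wdate-time` warning in GCC and Clang, which flags uses of `__DATE__`, `__TIME__`, and `__TIMESTAMP__`, so a transitive include that uses them should fail the build instead of silently changing the output.

```python
from typing import List, Optional


def compiler_command(source: str, output: str, stamp: Optional[str] = None) -> List[str]:
    """Build an argv list for a C compiler with an opt-in build stamp.

    BUILD_STAMP is a hypothetical macro name; the C code would read it
    instead of __DATE__/__TIME__.
    """
    cmd = ["cc", "-Werror=date-time", "-c", source, "-o", output]
    if stamp is not None:
        # Passed as one argv element (no shell), so the quotes reach the
        # preprocessor and BUILD_STAMP expands to a string literal.
        cmd.append(f'-DBUILD_STAMP="{stamp}"')
    return cmd


if __name__ == "__main__":
    print(compiler_command("main.c", "main.o", stamp="2024-01-01T00:00:00Z"))
```

With no stamp passed, two invocations get identical arguments, so any remaining nondeterminism has to come from somewhere else.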
inigyou 12 hours ago [-]
Better title: Reproducible builds are hard
arikrahman 9 hours ago [-]
Unless you use a flake or the project has a flake.nix
yencabulator 8 hours ago [-]
There is nothing in Nix that magically makes concurrent builds with completion-order-dependent outputs come out deterministic.
jdw64 18 hours ago [-]
Reading this, I think low level engineering is actually more dependent on specific environments. Hardware also has its own points of change. Usually, when you think at a high level, environmental changes are less significant than you might expect. But low level thinking tends to be tied to specific environments, which is what makes it difficult. The reason low level is hard is that even if the code itself is short, the hidden assumptions inside it are difficult and place a heavy cognitive load on the programmer. For example, even a short snippet in C like
`int value = (int)buffer`
requires a lot of implicit knowledge about the 4 byte alignment of the buffer, or whether int is exactly 32 bits. LLMs do not seem to be very good at knowing these things. Rather, they are strong at high level wrapping, but at the low level, they seem surprisingly difficult and somewhat useless. Hardware has CPU generation changes, and in the case of PLCs, where I mainly work, the protocol differences between vendors are far too severe. There does not seem to be any technology with a very long lifecycle.
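To make those hidden assumptions concrete, here is a sketch in Python rather than C: reading a 32-bit integer out of a byte buffer with the width, signedness, and byte order spelled out instead of implied. The function name `read_i32_le` is made up for illustration.

```python
import struct


def read_i32_le(buffer: bytes, offset: int = 0) -> int:
    """Read a signed 32-bit little-endian integer from buffer at offset.

    "<i" fixes the size at 4 bytes and the byte order at little-endian,
    and struct does not care about alignment, so none of the three
    assumptions is left implicit. Raises struct.error if fewer than
    4 bytes are available at offset.
    """
    (value,) = struct.unpack_from("<i", buffer, offset)
    return value


if __name__ == "__main__":
    print(read_i32_le(b"\x2a\x00\x00\x00"))
```

The rough C equivalent would be a `memcpy` into an `int32_t` plus an explicit byte-order conversion, which is exactly the knowledge the one-line cast hides.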
jstimpfle 17 hours ago [-]
Depends on what you mean by low level I guess. Compared to web application framework churn rate, simple procedural programming without many dependencies is remarkably stable. You tend to program in a way that works for most platforms (all targeted platforms). How to best do that you learn over the years. To me personally it's very refreshing if the environment around you does not constantly change. That affords learning a bag of tricks and a list of gotchas to avoid.
jdw64 16 hours ago [-]
I think you're right too. So I also think that maybe I'm viewing the changes as bigger than they actually are, based on my own standards
embedding-shape 13 hours ago [-]
> You tend to program in a way that works for most platforms (all targeted platforms).
Isn't that true for web frameworks too? Usually they'll only target unix, but if they target windows and macos, then they work on those platforms too? Or am I misunderstanding what you're trying to say here?
jstimpfle 13 hours ago [-]
This is how I mean it: In case of low level programming, the "platform" is the hardware/OS/compiler. In case of web programming, the "platform" is the web framework.
If you update the OS, hardware, or compiler, you will see only a few changes. If you update the web framework, you may see breakages, API deprecations or whatever. You may want to move to a different web framework entirely. TBH I don't really know, I don't know web programming beyond basic HTML/Javascript. That's what they say, though.
embedding-shape 13 hours ago [-]
Well I mean you're comparing two different solutions at different layers here.
In the case of a desktop application, unless you build things against OS libraries, your "platform" is also typically a framework, like QT or AppKit or whatever you end up using. That's the equivalent of the "web framework" in the web world.
Basically, it goes "Your app > GUI framework > other/OS libraries" for desktop apps, "Your app > web framework > other/OS libraries" for web applications.
Then in both approaches you can of course skip the framework if you want, no one is forcing you to use those in either of the cases.
Edit: I realize now we might be talking past each other, I was under the understanding that "web framework" is about backend web frameworks, but maybe you actually meant frontend frameworks running client-side. If so, replace "other/OS libraries" with "browser runtime" and my comment more or less still makes sense :)
jstimpfle 12 hours ago [-]
> your "platform" is also typically a framework, like QT or AppKit or whatever you end up using
That's not what I consider "low level programming". I don't use any of these.
Yes, you can try and do plain Javascript. Honestly Javascript is a much less pleasurable environment than a compiled statically typed procedural language. The main advantage of the browser is you get a viewport, you get font rendering etc. with almost no setup required at all.
embedding-shape 10 hours ago [-]
> That's not what I consider "low level programming". I don't use any of these.
So say C linking to Xorg-libraries and drawing GUI that way isn't low level programming, then what is? Only assembly is "low level programming" or what?
Meh, JavaScript is fine, like most dynamic Algol/C-like languages. Could be worse, could be TypeScript :)
But personally, browser environment is a hell of a lot easier to target than doing cross-platform native application development, but I'm a web developer who started doing native apps, not the other way around, might be why.
AnimalMuppet 13 hours ago [-]
More: If you upgrade the hardware or the compiler, you upgrade them. If you're doing web programming, you have to worry about the user upgrading their browser.
jstimpfle 12 hours ago [-]
It's not so much about the browser (I'm not aware of major incompatibilities introduced by new browsers or new W3C standards). It's about the software ecosystems (like frameworks, or node.js) that web people are relying on in order to create their web apps.
Dwedit 12 hours ago [-]
Looks like the formatting ate your asterisks at *(int*)buffer. Use \* to get an asterisk.
jdw64 11 hours ago [-]
Since edits aren't allowed after a certain amount of time, I'll keep that in mind next time. Thanks!
antirez 12 hours ago [-]
So to avoid those energy-hungry LLM companies from scraping your website, you force each browser to compute a lot of hashes in a necessarily energy-hungry loop, creating, at the same time, all kinds of accessibility problems?
ericpauley 11 hours ago [-]
I don’t get how people believe there’s a PoW function that both:
1. Allows access in reasonable time/battery use to me on my phone
2. Poses any meaningful challenge to the most compute-resourced organizations on the planet
I wonder how many cumulative hours of human life have been wasted waiting on Anubis.
beached_whale 10 hours ago [-]
There are a lot of people writing really bad scrapers and running them on far from high compute power systems. This is to prevent DoS caused by those. The big companies are often far more clever and know they are traversing the whole internet and can come back later.
grayhatter 10 hours ago [-]
> I wonder how many cumulative hours of human life have been wasted waiting on writing comments on creamsicle reddit.
I disagree with a lot of the decisions around the design of Anubis... but resisting the current drive of the industry to ruin as much of the good faith resource donations from others is an admirable objective.
The point isn't to increase the amount of work required to the point of exhaustion, it's to require that scripts be able to offer the exact same feature set that browsers offer. The point isn't to make it impossible, it's to make it more expensive than free.
Anubis isn't trying to prevent all scraping, it's trying to reduce the abuse just enough that real requests get their fair share. You don't need to outcompute the botnet, just slow them down a little.
I hate seeing the Anubis interstitial too, I've complained about it publicly already too. But it doesn't come close to the frustration of waiting 10s for an SPA to load all of the routes it'll never use before the first redraw. Clearly our industry has also decided latency is a good thing.
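For anyone unfamiliar with how "more expensive than free" works mechanically, here is a hashcash-style sketch in Python. It is an illustration of proof of work in general, not Anubis's actual challenge format: the `challenge:nonce` encoding, the function names, and the difficulty value are all assumptions.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash, regardless of difficulty."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: about 2**difficulty hashes on average."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    nonce = solve("example-challenge", 12)
    print(nonce, verify("example-challenge", nonce, 12))
```

The asymmetry is the whole trick: checking costs the server one hash, while each extra bit of difficulty doubles the expected work for whoever is requesting the page.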
lifthrasiir 11 hours ago [-]
The vast majority of that compute is locked in AI accelerators that do the inference. That hardware is bad at doing anything other than that; in fact, crawlers would need more residential proxies rather than more compute in that regard.
Analemma_ 8 hours ago [-]
> I wonder how many cumulative hours of human life have been wasted waiting on Anubis.
"How dare that mugging victim fight back".
The choice is not between Anubis and no Anubis, the choice is between Anubis and my website going offline because I can't afford the $400/month that AI scrapers would cost me (yes, I checked, and yes, that's the real figure) if Anubis wasn't in front.
frollogaston 4 hours ago [-]
That makes sense, and I believe you, I'm just surprised it really deters the scrapers.
xena 4 hours ago [-]
If it's dumb and it works, is it really dumb?
frollogaston 4 hours ago [-]
No it's not dumb, but I don't get how it manages to be so light still. Like I visit an Anubis-guarded site and barely have to wait. Scrapers really see that little CPU usage or wall time and back off? Or maybe that's just cause I'm not visiting sites that are under attack.
saintfire 2 hours ago [-]
It chooses the challenge weight based on signals. If your phone looks like a phone from a residential IP you get a simple challenge.
If you then spam requests you might get another, harder, challenge appear.
If you have a data center IP and look like bot traffic you get a hard challenge out the gate.
AFAIU after looking at their docs several months ago.
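Roughly, the idea sketched in Python. Every signal name, threshold, and difficulty value below is invented for illustration and is not taken from the Anubis docs or code.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical per-request signals; not Anubis's real inputs."""
    datacenter_ip: bool
    looks_like_bot: bool
    recent_requests: int


def pick_difficulty(signals: Signals) -> int:
    """Return a proof-of-work difficulty in leading zero bits."""
    if signals.datacenter_ip and signals.looks_like_bot:
        return 20  # hard challenge out the gate
    if signals.recent_requests > 30:
        return 16  # request spam earns a harder challenge
    return 8       # ordinary client on a residential IP: cheap challenge


if __name__ == "__main__":
    phone = Signals(datacenter_ip=False, looks_like_bot=False, recent_requests=2)
    print(pick_difficulty(phone))
```

That would explain why a normal visitor barely notices the wait while a scraper hammering from a data center pays far more per page.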
frollogaston 33 minutes ago [-]
Ah, knowing the type of IP is really advantageous by itself
hootz 11 hours ago [-]
Not just LLM companies, but bots in general. They were a big problem even before LLMs.
bmacho 8 hours ago [-]
They have 2 options:
- Put their ~1kb of text on a ~0kb website, make it cacheable, make hosting it free, make downloading and rendering it instantaneous, make it accessible and let users read it comfortably
- Set up a CAPTCHA and make the website inaccessible, spy on the users or give their history to trillion dollar ad companies, make them wait 10 secs to proceed.
Guess which one HN front-page bloggers choose? I often comment and/or flag them, but they never learn.
TazeTSchnitzel 5 hours ago [-]
Anubis doesn't rely on spying on the user.
edude03 12 hours ago [-]
Could have sworn the author was a nix(os) user already. I know it’s a meme, but all the problems they’re describing are literally solved by nix. The nix sandbox even catches calls for time, for example, to replace it with 0 for determinism.
trexd 11 hours ago [-]
They stopped using Nix due to disagreements with how the project was run.
Weird article. The author is clearly heart-broken about some changes in how Nixos is managed - but what those changes were and why the author dislikes them is left completely unclear. The link to the Determinate blogpost doesn't clarify anything. I guess it might make sense to Nixos insiders...
drunner 9 hours ago [-]
From what I gather you fear determinate systems will guide future Nixos dev like google steers chromium or am I way off base?
xena 10 hours ago [-]
Nix doesn't help when the issue is a compiler bug, sadly.
arikrahman 9 hours ago [-]
What about Guix? May be more their tempo.
arikrahman 9 hours ago [-]
This was my first thought, use a declarative system like Nix.
ComputerGuru 18 hours ago [-]
These seem very reasonable, the workarounds used are natural, and overall the article is not at all congruous with the conclusion in the (clickbait?) title?
Compilers literally made your project possible!
zeratax 14 hours ago [-]
> Clang relies on address layout for ordering things
I would consider that a bug tbh
j2kun 11 hours ago [-]
What is kind of annoying is that the author jumps to "I hate compilers" instead of "I will report/help fix this bug upstream."
chiyc 10 hours ago [-]
I don't get the sense they hate compilers at all. The writing describes work they seem to love doing. It's just clickbait.
And it may not have crossed their mind that the clang behavior is a bug after finding a workaround. I'd also assume compilers do things "no mere mortal can fully comprehend on their own".
xena 10 hours ago [-]
This might be the first time in my career I have genuinely found a compiler bug. I've been operating under the axiom of "don't assume it's a compiler bug, assume you're fucking it up somehow". When you get to the point that disabling ASLR makes it consistent intra-host I think I've won the right to at least suspect a compiler bug is at play.
I'll go file it upstream after work today.
ammar2 5 hours ago [-]
If you feel like increasing your power as per your post, this is a somewhat decent first LLVM issue, take a look at WebAssemblyCFGStackify.cpp :)
llvm/test/CodeGen/WebAssembly/cfg-stackify-eh.ll and friends are existing tests that you can kinda mangle if you want to get a good reproducer.
Otherwise, happy to put my reproducer/patch on the bug after you file it!
xena 5 hours ago [-]
I'm gonna have to file the bug without a minimal reproduction case. The issue seems to be those try_table blocks getting nondeterministically reordered at link time (is it using machine pointers for iteration order?). Sadly I'm observing this with a local checkout of binaryen, so it may take a while for you to find the minimal reproduction case.
StilesCrisis 11 hours ago [-]
Are reproducible builds a stated goal for Clang? I'm not sure that they are. If so, absolutely a bug.
Nix also needs the build output to be deterministic to calculate the hash. It also has the problems of timestamps etc. The build environment tries to be hermetic by setting the time to be epoch among other things.
mplanchard 12 hours ago [-]
Yes, reading this I was thinking about how many of these problems go away with a nix environment. Certainly not all of them, but it’s a great way to get a reproducible build environment that includes direct specification of system dependencies.
Nix hashes the build inputs, for which deterministic builds are not required, only desirable.
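A conceptual sketch of what "hashing the inputs" means, in Python. This is not Nix's actual derivation or store-path algorithm; the function name and the JSON encoding are assumptions made only to show that the key is computed before the build runs.

```python
import hashlib
import json
from typing import List


def input_hash(source: bytes, build_script: str, dep_hashes: List[str]) -> str:
    """Derive a key from everything that goes INTO a build.

    The build output never appears here, so the key is stable even if
    the build itself produces different bytes on each run.
    """
    description = {
        "source": hashlib.sha256(source).hexdigest(),
        "script": build_script,
        "deps": sorted(dep_hashes),
    }
    encoded = json.dumps(description, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


if __name__ == "__main__":
    print(input_hash(b"int main(void) { return 0; }", "cc main.c", []))
```

Under this scheme a nondeterministic build still lands under the same key; determinism only becomes mandatory when the key is derived from the output instead.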
biglost 17 hours ago [-]
Time, date, env variables, and random addresses... are also input data, maybe not as a flag, but still.
RyanSquared 17 hours ago [-]
Time and date are... tolerable. There's SOURCE_DATE_EPOCH which should always be set to whack it into submission when used. ASLR of the _compiler being invoked_ resulting in a difference in the _program being compiled_ is nuts and would break any self-hosting compiler with consistency checks.
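For build scripts that stamp things themselves, honouring the variable is short. A sketch in Python, assuming the usual convention that `SOURCE_DATE_EPOCH` holds integer seconds since the Unix epoch; the function name and the `env` parameter are mine.

```python
import os
import time
from datetime import datetime, timezone
from typing import Mapping, Optional


def build_timestamp(env: Optional[Mapping[str, str]] = None) -> datetime:
    """Return the timestamp a build should embed.

    Uses SOURCE_DATE_EPOCH when set, and only falls back to the wall
    clock when it is not.
    """
    env = os.environ if env is None else env
    raw = env.get("SOURCE_DATE_EPOCH")
    seconds = int(raw) if raw is not None else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    print(build_timestamp({"SOURCE_DATE_EPOCH": "1700000000"}).isoformat())
```

Passing the environment in explicitly keeps the function testable and makes the time dependency visible at the call site.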
xiphias2 10 hours ago [-]
ASLR is important, but you should be able to provide the random seed (I don't know if it's possible or not)
yjftsjthsd-h 17 hours ago [-]
Explicit is Better than Implicit.
swiftcoder 17 hours ago [-]
The Birth and Death of Javascript really had the gift of prophecy, eh
neocron 15 hours ago [-]
Was that the thing where Gary predicted js in the kernel?
frollogaston 4 hours ago [-]
How about hardware accelerators to interpret JS? At least it wouldn't break, cause modern JS all transpiles to the old one.
SSLy 8 hours ago [-]
bpf isn't a far stretch from that
swiftcoder 14 hours ago [-]
yes, although more directly relevant here, chrome-compiled-to-wasm-nested-in-chrome
yencabulator 8 hours ago [-]
Firefox's font rendering is sandboxed by being C/C++-compiled-to-wasm-then-aot-compiled-to-native-code. Yavascript is dead, all hail WASM.
I’m still surprised by Anubis’ decision not to make the PoW have a useful output, for example crypto mining, protein folding, or something else.
And I speak as being generally very critical of cryptos, but here rewarding the website owner with some cents to have access seems fair, and resolves the traditional issues about micro-payments.
frollogaston 4 hours ago [-]
I'm not even convinced that this works well for users who want to run that work on their machines. Distributed cluster of random untrusted hardware has got to be so much less efficient than a purpose-designed datacenter cluster, and harder to manage. I get that any "free" compute is cheaper than paid compute, but at some point the gap is so wide that the coordination alone doesn't make sense.
Wasn't there some famous home-computing project that recently stopped because of that? I thought it was Folding@home but that seems to still be going.
xena 10 hours ago [-]
Patches welcome. I'd love to do protein folding too.
omoikane 7 hours ago [-]
I would think twice before adding those types of features to Anubis. I like how Anubis currently does one thing and does that well. Once you make a captcha-like service that also does useful work, users will eventually perceive it as a useful-work-service that happens to have captcha-like function on the side, and that new perception will get a lot more people upset about Anubis.
We see this with Recaptcha where when it was first launched, some news sites praised it as making good use of what would have otherwise been wasted human effort. But eventually I started to see negative comments along the lines of how Recaptcha is just extracting free work to train self driving cars, nevermind the part about stopping bots. Since Recaptcha is now sometimes non-interactive, I am not sure if that data is still used for training, other than to improve Recaptcha itself, but the negative sentiment still holds whether that data is used or not.
zadikian 4 hours ago [-]
I agree, also making it do useful work could detract from how well it works as bot-prevention. Recaptcha suffered from its own success eventually, having to ask people to read nearly impossible things, and now many sites are instead asking me to solve puzzles with no practical use.
P.S. Great to see you, omoikane
xena 4 hours ago [-]
The last time I evaluated doing protein folding it required gigabytes of disk-data to make it happen. If there's a way to do it without needing gigabytes of disk-data, I'm at least interested in hearing it out.
YoshiRulz 4 hours ago [-]
The original reCAPTCHA which was used to train OCR came out back when Google was at least pretending to not be evil, hence the favourable coverage about old books. Now that the challenges are used to train Waymo cars (citation needed, but obviously they won't be sharing the data), and Google is definitely not tracking everyone with it (according to... Google, the adtech company), there's no positive spin you could possibly put on it.
Were Anubis to add crypto mining, even if all the revenue went to Techaro, you could still say "the enshittification is a shame, but at least they're not Google". Using the compute for BOINC protein folding somehow should be unobjectionable.
KolmogorovComp 9 hours ago [-]
what about something (crypto being the only idea I have) that could be leveraged to reward the website owner?
it does not answer if it's a conscious design choice or just a technical limitation of the PoW, or just a "patch welcome" issue.
xena 9 hours ago [-]
It's a conscious design choice. Doing that gets you considered a botnet/malware. I don't want Anubis to be considered a botnet/malware.
ssl-3 8 hours ago [-]
I already consider Anubis to be malware and I'm just some dude who likes to play with computers.
If it mined crypto instead of just burn clock cycles, then that could not in any way serve to lower its ranking in my book. It's already at minimum.
xena 8 hours ago [-]
That's great. I love that for you. Please leave me alone.
KolmogorovComp 9 hours ago [-]
OK, I don't think it would be the case if it was one of many algorithms offered by Anubis, especially if it's enabled by the webmaster.
That being said I don't know any crypto that would technically fit as a lightweight PoW.
j16sdiz 10 hours ago [-]
> What do you do when the client has WebAssembly disabled?
Do people really do that? -- disable, not just using old browsers with no wasm.
Disabling wasm while keeping js enabled is a configuration I can't understand
pohl 10 hours ago [-]
It’s easy to imagine an organization with a paranoid security posture ending up with that configuration because they decided to only enable a minimum necessary feature set where they determined that JS was necessary while WASM was not.
xena 10 hours ago [-]
iOS Lockdown Mode also implements this configuration.
pertymcpert 18 hours ago [-]
If Clang generated non-deterministic output due to pointer addresses then that's a bug (happens regularly) that should be fixed. The most common way this happens is if some code path is iterating over a DenseMap, which is non-deterministic. Sometimes that's fine and sometimes that's not, depending on how that map is used. The common way to fix that is to switch to a MapVector, which pays some additional runtime/memory cost to guarantee deterministic iteration order.
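A toy model of the failure mode in Python, since the real thing needs LLVM. Sorting by a simulated address stands in for "iteration order depends on pointer values" (a real DenseMap is a hash table, not a sorted structure), and a plain insertion-ordered dict stands in for MapVector. The addresses and block names are made up.

```python
from typing import Dict, List


def emit_in_address_order(blocks: Dict[int, str]) -> List[str]:
    """Emit blocks in an order derived from their (simulated) addresses."""
    return [blocks[address] for address in sorted(blocks)]


def emit_in_insertion_order(blocks: Dict[int, str]) -> List[str]:
    """Emit blocks in the order they were inserted, ignoring addresses."""
    return list(blocks.values())


# Same program compiled twice; ASLR hands out different addresses per run.
run_1 = {0x7F10: "try_a", 0x7F40: "try_b"}
run_2 = {0x7F90: "try_a", 0x7F20: "try_b"}

if __name__ == "__main__":
    print(emit_in_address_order(run_1), emit_in_address_order(run_2))
    print(emit_in_insertion_order(run_1), emit_in_insertion_order(run_2))
```

The address-ordered output flips between runs while the insertion-ordered output does not, which also matches the observation that disabling ASLR makes the result consistent on one host.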
xena 17 hours ago [-]
I'll try and make a minimal reproduction case and file a bug. Do you know of any tooling that can take a binary and fuzz it down to a minimal reproduction set?
Claude code is actually rather good at this. If your initial testcase is not too big, you can use creduce or cvise.
xena 10 hours ago [-]
Sadly my initial test case is binaryen which has 290 compilation units. I will try though.
3dedb728-3f77 4 hours ago [-]
Hey, is the author alright?
Why does he need to induce the reader in leap of logic with small characters.
Mond_ 45 minutes ago [-]
It's their own blog and they can do whatever they feel like to break up the flow. I think it's quite charming tbh
randusername 12 hours ago [-]
I've seen posts by this author before and did not understand if the commentary characters were referential or a creation of the author. Turns out it's the latter. I dismissed the underlined names as just styling, not hyperlinks.
I hate proof of work code running on my machine for the benefit of someone else. It's like planting a crypto miner.
tengwar2 12 hours ago [-]
Yup, and I suspect that even if OP is honest in this respect, if proof-of-work gets established as a normal practice for web pages, it's going to be used this way.
But just taking this as-is, what is the environmental impact likely to be when multiplied up by the number of users? Proof of work is a bad idea.
ctrlmeta 12 hours ago [-]
Do proof-of-work pages actually stop AI bots? Big AI companies have enough compute to solve these challenges at scale. And if their bots are already doing much heavier work to fetch, read and process each page, then solving a small challenge first seems unlikely to be a serious barrier. Who are these proof-of-work challenges actually helping?
titularcomment 10 hours ago [-]
I believe poisonous loops of non-sense text are the best choice in terms of LLM capabilities and human distinguishing potential; the next iteration could be non-sense with reasonably intact grammar and content. At the very least, show some content whilst doing the PoW (isn't the point rising computational costs? Give me something like YouTube video decryption instead of having me wait at least) OR use the PoW for some useful protein-folding, finding the next prime, or an alternative monetization scheme.
Getting in the maze influences your client's challenge difficulty.
account42 13 hours ago [-]
Yes, all these kind of bot checks are essentially malware.
hueho 10 hours ago [-]
The "benefit of someone else" in this scenario is the site operator not having their website down or their hosting bills unsustainable because of misbehaving (which are 99% of current) web scrapers from AI companies.
I tried doing that at first. I kept running into edge cases that made the whole thing fall to ribbons. I gave up and am just falling back to what I know works: compiling the WASM to JS.
znpy 16 hours ago [-]
> What do you do when the client has WebAssembly disabled?
> I decided to take inspiration from the legendary talk The Birth and Death of JavaScript and just recompile the WebAssembly to JavaScript.
So what do you do when the client has Javascript disabled ?
To avoid all those grotesque and absurd compilers and runtimes, more for those of computer languages with a ultra-complex syntax (c++ and similar), I now design "binary specifications" which I "design" and "validate" with RISC-V assembly coding.
Here, since any whatwg cartel web engine is an issue, the author should not bother.
I'm surprised by the amount of heckling this post received almost immediately! And a lack of constructive input.
I for one enjoyed the article and understand what you're getting at.
yjftsjthsd-h 17 hours ago [-]
> This is the goofiest I've seen written unironically in quite a long - the C preprocessor is not part of the compiler. The pre in preprocessor should probably give it away.
This is true but doesn't seem relevant; does replacing the word "compiler" with "build chain" change anything? Because that seems like the clear meaning.
LPisGood 17 hours ago [-]
Re: source code producing different binaries: things like ASLR, stack canaries, optimization levels, linking, etc all lead to different binaries.
ekjhgkejhgk 13 hours ago [-]
[flagged]
charcircuit 18 hours ago [-]
As long as the program is equivalent there isn't an actual problem here. Requiring the output to always be the same is an arbitrary restriction.
If you want to have users trust that someone else hasn't modified it, then sign it with your identity.
yjftsjthsd-h 17 hours ago [-]
We'd like to verify, not trust.
charcircuit 16 hours ago [-]
The whole point of a signature is that you are able to verify that what was signed was in fact a message that was signed by the signer.
robinsonb5 16 hours ago [-]
Sure, but a signature doesn't prove that a particular binary came from a particular codebase - merely that a particular human (or other trusted entity, for varying degrees of "trusted") has vouched for it.
Being able to reproduce the binary from the source code and being able to verify that it's the same as the original is quite important in some contexts.
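The check itself is tiny once the build is reproducible. A sketch in Python, with hypothetical function names, comparing a locally rebuilt artifact against a published SHA-256 digest:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_published(rebuilt: bytes, published_sha256: str) -> bool:
    """True if a local rebuild is bit-for-bit what the publisher shipped."""
    return sha256_hex(rebuilt) == published_sha256.strip().lower()


if __name__ == "__main__":
    artifact = b"pretend this is a binary"
    print(matches_published(artifact, sha256_hex(artifact)))
```

A signature says who vouched for the bytes; this comparison says the bytes came from the source you built, and it only works if a single differing timestamp or reordered block is ruled out.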
charcircuit 16 hours ago [-]
>Being able to reproduce the binary from the source code and being able to verify that it's the same as the original is quite important in some contexts.
I disagree. The contexts that people come up with are purely theoretical, and are not practically important. Please do try and convince me otherwise by sharing such a context. From my view the juice of trying to accomplish this is nowhere near worth the squeeze.
harrouet 12 hours ago [-]
You disagree but you're wrong.
Military context: a government would want to review the code and compile themselves. Provide a hash of the target binary to ensure they've compiled it correctly.
SDLC: provide auditors with _proof_ that the tested binary is indeed coming from the audited code
charcircuit 9 hours ago [-]
>a government would want to review the code and compile themselves. Provide a hash of the target binary to ensure they've compiled it correctly.
The government doesn't want to do this. A lot of the time the government doesn't even get the source code in the first place.
>provide auditors with _proof_ that the tested binary is indeed coming from the audited code
This can be done by showing the auditor how one's CI is set up to build checked-in code and sign it.
skydhash 12 hours ago [-]
Military Context: Just build the code that you just reviewed. No need to get the binaries
SDLC: Traceability is more important than reproducibility. Keeping logs is more important than deterministic build outputs
skydhash 12 hours ago [-]
> Being able to reproduce the binary from the source code and being able to verify that it's the same as the original is quite important in some contexts
Why not build your own binaries and be done with that. If you don’t trust the compiler or the machine doing the build, just build the code yourself.
robinsonb5 6 hours ago [-]
Sure, I can do that, but there's some value in being able to check quickly and easily that, for example, the xz utils binaries shipped by a major distro actually match the published source.
Also useful for checking that a binary containing GPLed code does actually correspond to its published source.
skydhash 5 hours ago [-]
The capability may be nice to have, but what about its usefulness? Would that have been of use in any real world situation?
dyauspitr 18 hours ago [-]
LLMs should be trained on and directly output binary.
klodolph 18 hours ago [-]
On the off chance that you’re serious, that would result in disastrously bad output. The difference between “jmp $+15” and “jmp $+16” is inscrutable and the LLM would not be able to pick the right one without tooling.
That tooling is a compiler. The higher level, the better chance the LLM can be steered to good output. Machine code is hopeless, don’t bother.
torginus 7 hours ago [-]
> The difference between “jmp $+15” and “jmp $+16” is inscrutable
Just like the difference between 'him' and 'her' is inscrutable taken out of context, but that's why LLMs have embeddings they use to store contextual information in huge vectors and have an input processing phase during which the input tokens gain contextual information, so that the LLM knows that 'him' refers to 'Peter' and 'her' refers to 'Jane'. Likewise it will be able to infer that $+15 is the 'success' branch of control flow and $+16 is the fail branch.
The way computer programs and natural language differ is that in language, words with absolute or at least very constrained meanings are common, while code is basically pure manipulation of symbols, with variable and function names being meaningless helpers, and the actual meaning needs to be deduced from the way these symbols are manipulated.
In fact, I think LLMs are actually surprisingly good at this kind of abstract symbol manipulation, and are far less bothered than humans with 'add rax, rcx' by the fact that the meanings of 'rax' and 'rcx' are heavily contextual, as they dedicate a lot of time to building up rich contextual information that might be different in every place these symbols appear.
klodolph 4 hours ago [-]
> Just like the difference between 'him' and 'her' is inscrutable taken out of context,
The context is pretty flexible, like "Do you know Jim? I saw him at the store." Or, "Do you know Jim? Fifteen days ago, I saw him at the store." There’s a relatively small universe of pronouns (him, her, that, who, etc) and the pronouns refer to a token nearby (in this case, Jim).
With machine code, there’s a massive set of jump offsets, and the referent isn’t a token, but rather a location to start processing.
> In fact, I think LLMs are actually surprisingly good at this kind of abstract symbol manipulation,
When you’re manipulating machine code, you’ve stepped away from abstract symbol manipulation and you’re just manipulating byte values now.
I don’t think your argument here is convincing. Maybe you can point to a demo or some architecture where this works. But my sense is this—once you start designing a harness to make LLMs capable of writing machine code, or designing an architecture for LLMs to write machine code, something in your implementation probably looks like an assembler, and something in your internal tokenization of the machine code probably looks like a higher-level language.
pjmlp 18 hours ago [-]
That compiler does wonders with languages that have UB in their specs, especially when it has optimization passes driven by heuristics.
Also, there are dynamic compilers where the shape of the machine code changes as the code executes, and each single execution will certainly generate different sequences, depending on the program execution and where it is running.
Deterministic JIT compiler code generation, at least on optimising ones, is not a solved problem.
faangguyindia 18 hours ago [-]
What about AOT optimization, which brings AOT closer to JIT performance? Isn't that something an LLM + harness can easily do?
klodolph 17 hours ago [-]
I think the idea that AOT is inherently faster than JIT, or vice versa, is a thoroughly debunked idea.
You can have LLMs help you optimize code but I don’t think you can do this unattended for non-trivial code.
jenadine 18 hours ago [-]
> The difference between “jmp $+15” and “jmp $+16” is inscrutable
I don't see why that's the case.
LLM trained on binary would totally see it, not?
Also the tool can also be running the test and a debugger.
klodolph 17 hours ago [-]
> I don't see why that's the case. LLM trained on binary would totally see it, not?
It would not. You find the correct version by counting the number of bytes to the destination. LLMs are famously bad at this kind of problem (counting).
> Also the tool can also be running the test and a debugger.
The test needs to provide a good amount of signal. That’s too hard if you are throwing machine code at the wall.
In order for debuggers to work, you need some kind of model that describes what the code should do and what state the computer should be in after each instruction. That model is high-level code.
I can understand the intuitive appeal of training LLMs with machine code, but all of my experience with LLMs suggests that they are incredibly ill-suited to the task, and we just don’t have the capacity to train them to make useful machine code.
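To make the counting point concrete, here is a rough sketch of what an assembler does to pick the right offset. Everything in it is an assumption for illustration: the instruction sizes are made up (not real x86 encodings), and offsets are taken relative to the end of the jump instruction.

```python
# Toy program: (label, mnemonic, encoded size in bytes, jump target or None).
# Sizes are invented for illustration; they are not real encodings.
PROGRAM = [
    (None, "cmp", 3, None),
    (None, "jne", 2, "fail"),
    (None, "mov", 5, None),
    (None, "ret", 1, None),
    ("fail", "xor", 2, None),
    (None, "ret", 1, None),
]


def resolve_offsets(program):
    """Return {instruction index: relative jump offset in bytes}."""
    # Pass 1: count bytes to find the address of every instruction and label.
    addresses = []
    labels = {}
    position = 0
    for label, _mnemonic, size, _target in program:
        if label is not None:
            labels[label] = position
        addresses.append(position)
        position += size

    # Pass 2: offset = target address minus the address just past the jump.
    offsets = {}
    for index, (_label, _mnemonic, size, target) in enumerate(program):
        if target is not None:
            offsets[index] = labels[target] - (addresses[index] + size)
    return offsets
```

The right answer for the `jne` falls out of summing the sizes of everything between it and `fail`; get one size wrong and the jump lands mid-instruction.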
zx8080 17 hours ago [-]
Can "LLMs are bad at counting" be generalized to "LLMs are better at complex stuff but make more mistakes in simple stuff"?
fluoridation 17 hours ago [-]
I would phrase it as "LLMs are good at big picture stuff and bad at fine detail", or to put it another way, they're accurate, but imprecise and with low reproducibility.
bregma 13 hours ago [-]
It is my experience that it's the opposite. LLMs are very very precise but wildly inaccurate. They might give you 17 significant digits but be off by 10 orders of magnitude, to use a metaphor.
fluoridation 8 hours ago [-]
Sounds like we're in agreement, then. The 7 digits it got correct are the big picture, and the rest are the details. Are you disagreeing with my statement or with my usage of "accurate" and "precise"?
benj111 13 hours ago [-]
But where does that leave us when programmers treat themselves as architects with the AI doing the drudge work? As seems to be the fashion.
It then means you have 2 parties focussing on the big picture and no one focussing on the details.
fluoridation 5 hours ago [-]
I said "big picture stuff", but I guess I should have said "broad strokes". The truly correct answer is probably similar to what the model will answer, and if your problem is such that it can work with small imperfections in a solution, then the LLM helps. If the solution needs to be exactly right, then it will probably fail.
Yesterday on a whim I tried asking a local model a question about kanji that look different in different fonts despite being the same character (to the point of strokes appearing in completely different directions), and the model hallucinated imgur links to images of the characters. If imgur could work with approximate references to data maybe that would have worked.
ozlikethewizard 17 hours ago [-]
It's more that LLMs are better at vague problems with multiple non-perfect solutions, and struggle with problems that require precision.
klodolph 17 hours ago [-]
No, I don’t think so. LLMs are good at a lot of simple tasks, but bad at certain simple tasks. Moravec’s paradox in a new iteration.
It applies to humans too. Calculus is “simple” but it takes something like sixteen years to train a human to do it, if all goes well. Meanwhile, most humans think that inverse kinematics is, like, the easiest thing in the world (it’s a super complicated task).
fluoridation 17 hours ago [-]
Calculus is definitely the harder task, considering it took a species developing the cognitive capacity for symbolic reasoning for it to show up, whereas any animal can figure out how to position its limbs. Yeah, we figured out how to make CAS programs before inverse kinematics software, but that's because computers were made to solve numerical problems, not to replace the cerebella of chordates.
klodolph 11 hours ago [-]
> Calculus is definitely the harder task,
You’re only evaluating “harder” or “easier” based on the perspective of somebody who has a mammalian brain with millions of years of selective pressure to make it suitable for solving inverse kinematics problems.
The point here is that when we start constructing agents or tools with different architectures to ourselves, it makes sense to reevaluate notions of whether something is ‘hard’ or ‘easy’. LLMs are bad at counting not because counting is hard, but because their architecture makes it hard.
fluoridation 8 hours ago [-]
I'm evaluating them using an objective metric, which is how long each took to arise in the universe. It could have never been the case that calculus arose before inverse kinematics, because a thing like that could not interact with the real world.
Also, I suspect you're comparing dissimilar things, because in one case you're looking at a brain doing both inverse kinematics and "calculus" (sense 1), and in the other you're looking at a computer doing both inverse kinematics and "calculus" (sense 2). The kind of calculus a CAS does is not the same kind that a human does. It's less versatile, for one.
>The point here is that when we start constructing agents or tools with different architectures to ourselves, it makes sense to reevaluate notions of whether something is ‘hard’ or ‘easy’.
Well, no, because when someone says that calculus is hard and moving their arms is easy, they're not talking about how hard it was to create each functionality, they're talking about how hard it is to employ each. We would need to ask a computer how hard it thinks the tasks it does are to do.
klodolph 6 hours ago [-]
> I'm evaluating them using an objective metric,
I don’t think the metric is at all reasonable, and the fact that it’s “objective” doesn’t make up for its other shortcomings. I don’t think we have a basis for agreement here—I think you’ve framed the argument in a way that supports a “calculus is hard” conclusion merely by defining “hard” in such a way that supports your conclusion from the start, but I think that approach is only useful as a way to win an argument, and we’ve failed to share ideas once you start using that tactic.
fluoridation 6 hours ago [-]
>I think you’ve framed the argument in a way that supports a “calculus is hard” conclusion merely by defining “hard” in such a way that supports your conclusion from the start
It seems to me you're the one who first did that by equivocating what is easier to do and what is easier to make a machine do.
>we’ve failed to share ideas once you start using that tactic
Well, I certainly don't agree with that.
dezgeg 15 hours ago [-]
Even if it could, it would be ridiculously token-inefficient to update a huge number of addresses whenever some small change is made in the middle of a binary.
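A rough illustration of how one small insertion ripples through the jumps that span it. This is a toy model with hypothetical names: instructions are just byte sizes, jumps map a source instruction index to a target index, and offsets are relative to the end of the jump.

```python
def jump_offsets(sizes, jumps):
    """sizes: byte size per instruction; jumps: {source index: target index}."""
    starts = [0]
    for size in sizes:
        starts.append(starts[-1] + size)
    # Offset is measured from the byte just past the jump instruction.
    return {src: starts[dst] - starts[src + 1] for src, dst in jumps.items()}


def insert_instruction(sizes, jumps, position, size):
    """Insert one instruction at `position` and re-index the jumps."""
    def shift(index):
        return index + 1 if index >= position else index

    new_sizes = sizes[:position] + [size] + sizes[position:]
    new_jumps = {shift(src): shift(dst) for src, dst in jumps.items()}
    return new_sizes, new_jumps


sizes = [2, 2, 2, 2, 2, 2]
jumps = {0: 5, 1: 4, 5: 2}
before = jump_offsets(sizes, jumps)
new_sizes, new_jumps = insert_instruction(sizes, jumps, position=3, size=3)
after = jump_offsets(new_sizes, new_jumps)
```

In this example every jump crosses the insertion point, so all three offsets change even though only one 3-byte instruction was added. A source-level edit leaves that bookkeeping to the assembler or linker.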
xiaoyu2006 18 hours ago [-]
It should not. Abstraction in software engineering brings intelligence. (compression correlates to intelligence)
frwrfwrfeefwf 17 hours ago [-]
people don't get this
dyauspitr 18 hours ago [-]
Why? I mean this is all emergent, right? And it’s not like humans ever work at this level. It would be very interesting to see what sort of outputs and abstractions an LLM comes up with.
shshshjaja 18 hours ago [-]
[flagged]
bandrami 17 hours ago [-]
Generative algorithms have been studied for decades now and while they have led to some interesting results they're a bad fit for LLMs because there's no such thing as a "plausible" binary: a small perturbation yields an unusable result.
fulafel 17 hours ago [-]
Technically they are, just a subset. But still a practical one, they're frequently used to produce executable files.