Fuzz testing basically means generating random (sometimes guided) inputs to find ways to trigger failures, so it makes sense that an input that doesn’t trigger an error is just counted as one more run without a failure. There is no reason to output anything for it: you might run the fuzzer for a long time searching for a failure, and you wouldn’t be interested in the uninteresting successful cases.
Also take a look at the web interface for the fuzzer (--fuzz without a limit), which can provide a more interactive experience: it presents the statistics and shows green/red dots for the files, indicating what has been explored by the fuzzer.
I am not super familiar with the fuzzer, but it seemed like you might have fallen into a routine of always passing a limit to --fuzz, and thus might be unaware of the web UI, so I wanted to mention it in case that’s true.
Agreed. And normally (including some successful fuzz testing I’ve done), this works beautifully. Let’s call that the “steady state” use of fuzz testing. But I have two unusual objectives right now: 1) learning enough to document some things (and wanting to instrument to that end, rather than just trying to understand via the code), including edge cases, and 2) wanting to instrument in order to properly construct my fuzz setup. #1 could be considered pretty unusual, and one could say I simply need to edit my local std and remove @disableInstrumentation() (if that’s all that’s going on here… and it seems it may be), but #2 may be a normal journey for a coder to take. It seems that wielding the smith to produce what you need is not always going to be easy enough to do right by inspection, without a little debugging. Programmers are used to using simple traces to do fast poor-man’s debugging, but this doesn’t seem possible here. Perhaps there should be a flag to indicate whether you’d like instrumentation turned off or not, which defaults to the current behavior, but, when authoring some fuzz code, a coder could turn it off in order to see what’s going on?
So yes, for this special purpose, I’ve been using limit=10 or 100 or 1000 and wanting to see the nature of the fuzzing I’ve constructed. For my real-world case, it’s much more interesting than this std-example, but even the std-example case turns up at least one interesting case – the “long chains of 0 values” case – which may be suspicious (a bug?), but may also just be in the nature of how fuzz values are generated from bitstreams (and just considered worthwhile waste)… I’m still trying to understand enough to get to the bottom of that question. If it’s “to be expected”, I’d like to understand that, too, so that, indeed, when a future coder considers my documentation, and it mentions this, they can move along and not get distracted by it.
And also, thank you for the link to that valuable thread discovering more about @disableInstrumentation(). One of the merges suggests that it’s nearly essential, to avoid real trouble with premature access to thread-local storage, but once that pitfall is past, and the testing is underway, it seems that motivation could disappear. So, perhaps an approach would be to document a scope-local @enableInstrumentation() (I see that no such thing currently exists, and this isn’t surprising - since fuzzing runs as its own process, it’s fine for instrumentation to be left off indefinitely) - this would allow usercode to re-enable it in order to instrument. Another option might be to have a --fuzz-test, which would only run 10 or so iterations, by default, and would be single-threaded, and would be for userland fuzz code instrumenting/debugging purposes. It would be imperfect - not executed exactly as the “real” execution of a million iterations, but might be sufficient for the coder to get his code “right”, doing the right thing syntactically.
I don’t understand what you are trying to say regarding #1. It sounds like you want to remove @disableInstrumentation() to then fuzz the fuzzer to better understand it. I find that a bit odd and relatively unlikely to bring you valuable insights. The way I see it, it is there so that the fuzzer doesn’t waste its time fuzzing itself and instead fuzzes the actual code you decided you want to fuzz.
If you include a whole bunch of code in the fuzzing that needs to be traversed before the fuzzer gets to the actual code, then you make it less likely to find something you are interested in, and when a normal user writes a fuzz test they aren’t looking for bugs in the fuzzer.
I don’t think you actually want to fuzz the fuzzer every time, it might make more sense to have more normal unit tests for the fuzzer to make sure all the pieces work and validate its implementation that way.
Basically I don’t think fuzzing is a tool that needs to be used for every problem.
Testing the fuzzer in particular may be better done by other means than running the fuzzer on itself, because doing that would require solving awkward bootstrapping problems that don’t bring much value to the user.
With the compiler it makes sense to implement in Zig so that you can use the language to develop the language, with the fuzzer we already can use Zig, so I don’t see much value in over-complicating it.
I think it is still valuable to have a separation between the code that does the fuzzing and the code that should be fuzzed, so I don’t really see it as a pitfall and I haven’t seen anything that suggests that it is a pitfall.
Long chains of 0 are something that happens in software / are often a possible input so it makes sense that it would be fuzzed.
I discovered something of value - the fuzz test is run once when you do a normal test run (not --fuzz). This is helpful for debugging your fuzz code… or it would be if it weren’t for the tendency to get long strings of 0s at the start of the fuzz data. Normal data comes later, but it means that loops often don’t enter at all if they’re eosWeightedSimple()-based like the sample code, as the most common scenario is an immediate eos (i.e., while(false)). But, it’s a promising bit. I think the decision to run a fuzz test through one iteration for a “normal” test run is a good one for this purpose. If you can make that “work” to debug fuzz testing, then having instrumentation turned off (always) during the real fuzz testing is natural.
No, sorry for the confusion; I can see how it would come from two different goals of mine. But here I mainly meant: one’s fuzz function implementation might not easily be super simple, and might need some whittling, to get it to do what you want (run your code the path you intend). But I agree with you:
In fact, what I found is that your fuzzer test will run once through in a normal test run, so you can instrument it all you want. However, you have to be aware that some boolean (esp. eos) decisions are fairly likely to come up ‘false’ on the first shot, so an outer path/entry might not enter, and you won’t see anything that happens inside. Again, though, you can easily work around that with a little care. It’s just a matter of being aware, and now I am. (I actually only ran into this while trying to understand the “sample” silly fuzz example that zig init spits out; my own real code wasn’t constructed that way, and a single run is more than enough to see that everything is running correctly for my particular fuzz case, so subsequent --fuzz runs do just what I expect. It’s quite fantastic. And I’m traveling and using a very old laptop, but still, the way it whizzes through a million tests is impressive indeed.)
Right, but what’s unexpected is that, given the zig init example, for instance, which contains a smith.eosWeightedSimple(7, 1) eos, you’d expect approximately 7x more likelihood of a true, thus entering the loop (at least once) than a false, but my instrumentation (which I could get to show up by coercion) shows that the first 50 or so calls yield falses. Thereafter, you get the proper (roughly 7:1) ratio of trues to falses, until a thousand or two iterations down the road, when you get another long string of 0s again, and so forth. It’s an anomaly that might be a bug, and, as you’ve mentioned, isn’t really important, because: “who cares if the first 50 tests don’t run, you’re fuzzing, so you’re doing a million, and it’s not even likely that you set up your fuzz that way, so it’s not even likely to ‘fail’ in that way anyway, so… just move along.” I think this is approximately true, though I might be curious enough to look deeper at it… or file an issue if I can make a tidy reproducible. Anyway…
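To make the "long string of 0s" observation concrete, here is a toy model in Python. It is not the real std.testing.Smith code: the `weighted_choice` decoder and its byte-to-slot mapping are assumptions made purely for illustration. It only shows that, if a weighted decision is decoded from input bytes by slot, a run of zero bytes always yields the same (first-listed) outcome, while varied bytes give the weighted ratio.

```python
from collections import Counter


def weighted_choice(byte: int, weights: list[int]) -> int:
    """Hypothetical decoder: map one input byte to an outcome index.

    Outcome i owns weights[i] consecutive slots out of sum(weights).
    """
    slot = byte % sum(weights)
    for index, weight in enumerate(weights):
        if slot < weight:
            return index
        slot -= weight
    raise AssertionError("unreachable: slot is always < sum(weights)")


def tally(data: bytes, weights: list[int]) -> Counter:
    """Count how often each outcome index is chosen for the given input."""
    return Counter(weighted_choice(b, weights) for b in data)


# 50 zero bytes: every decision lands in slot 0, i.e. the first outcome.
zeros = tally(bytes(50), [7, 1])
# Every byte value exactly once: the outcomes split 7:1 as the weights say.
uniform = tally(bytes(range(256)), [7, 1])

print(zeros)
print(uniform)
```

If the real decoder behaves anything like this, a stretch of zero input would be enough to explain a stretch of identical decisions, without any bug in the weighting itself. Whether that is what actually happens would need checking against the std source.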
Ha! So it is. Moreover, that was the older zig init boilerplate; the new one is entirely different, and doesn’t depend on eosWeightedSimple() at all (it does use eos()). I wonder if false - true was once the other way around. Anyway, no matter. Sometime, though, I’ll have to try to reproduce the windows of long zeros… they only showed up every thousand or so tries (or maybe it was 10000), but they were pretty visible. Anyway, nothing else interesting on this front at present, I think.
Wait, this does renew my curiosity, though, and may expose another misunderstanding. Many of the std calls to eosWeightedSimple are like that old init-generated code: (large, small). (grep -rnw '.' -e 'eosWeightedSimple') - like lib/std/deque.zig:641. Why, in that bit of code, would you want the while loop to never even enter once 15 times more commonly than letting it run just once? Isn’t the result in this case just a few single-iteration runs through the while loop (multiplied by a million or so fuzz iterations, of course)? I would expect that the context would like to survive a few iterations, then let the outer fuzz loop click forward, and so on. Am I missing something? I would expect, when used as a loop weight like this, at least, that the pattern would always be (small, large), in order to inspire some reasonably normal number of loops (like 15-ish, with a chance of 0 or 7 or 25 perhaps). But that way-round is (much) less common… or, perhaps, nonexistent, by my quick survey. (But there are uses that aren’t this kind of loop-trigger sort, too.)
Anyway, there’s a good chance I’m missing an important bit of understanding about eos here.
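For reference, the arithmetic behind the "15-ish" intuition can be sketched as follows. This assumes each loop check is an independent weighted coin flip, which is a simplification and not something the source confirms about the fuzzer; `loop_stats` is a hypothetical helper, not a std function.

```python
from fractions import Fraction


def loop_stats(continue_weight: int, stop_weight: int) -> tuple[Fraction, Fraction]:
    """Return (expected body executions, probability of zero executions).

    Models `while (keep_going)` where keep_going is true with probability
    continue_weight / (continue_weight + stop_weight) on every check.
    The number of executions is then geometric: P(N = k) = p^k * (1 - p).
    """
    p_continue = Fraction(continue_weight, continue_weight + stop_weight)
    p_stop = 1 - p_continue
    expected = p_continue / p_stop  # mean of the geometric distribution
    return expected, p_stop


# Continuing is 15x more likely than stopping.
print(loop_stats(15, 1))
# Stopping is 15x more likely than continuing.
print(loop_stats(1, 15))
```

Under this model, 15:1 in favour of continuing gives 15 expected iterations with a 1-in-16 chance of never entering the loop, while 15:1 in favour of stopping gives an average of 1/15 of an iteration. So which side of the weighting means "continue" matters a great deal.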
I think you are overlooking/misreading the fact that the return value is negated with !, so there it is actually 15 times more likely to continue than to stop the loop.
However, I am unsure why it isn’t written without the negation and with the weights swapped; it would be interesting to know whether there is a subtle but important difference between the two.
I, too, am curious about whether eosWeightedSimple(a, b) is more or less the same as !eosWeightedSimple(b, a), and why one would prefer to hide an exclamation point for me to miss!
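One way to think about that question is with a toy model. Everything here is an assumption for illustration: `weighted_bool` is a hypothetical decoder in which the first weight belongs to false and a byte is mapped to a slot by modulo; the real std.testing.Smith implementation may work quite differently.

```python
def weighted_bool(byte: int, false_weight: int, true_weight: int) -> bool:
    """Hypothetical decoder: the first `false_weight` slots decode to False."""
    return byte % (false_weight + true_weight) >= false_weight


def negated(byte: int) -> bool:
    """Models `!weighted(15, 1)`: negate a mostly-false decision."""
    return not weighted_bool(byte, 15, 1)


def swapped(byte: int) -> bool:
    """Models `weighted(1, 15)`: same odds, no negation."""
    return weighted_bool(byte, 1, 15)


all_bytes = range(256)
negated_true = sum(negated(b) for b in all_bytes)
swapped_true = sum(swapped(b) for b in all_bytes)

# Same overall odds across all byte values...
print(negated_true, swapped_true)
# ...but the two forms can disagree on a specific byte, such as zero.
print(negated(0), swapped(0))
```

In this model the two spellings have identical probabilities but map individual inputs differently, so they would disagree on degenerate input such as all zeros. That may or may not carry over to the real code. Possibly the negated form is also preferred simply because it reads as "while not end of stream".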
If it’s ridiculous that I hijack this thread for related followup, please, mods, feel free to fork it.
Question: why not make std.testing.Smith.constructInput() pub? Here’s my angle: when noodling some test code for the fuzzer to run, it’s nice that a normal test run (even zig test, without zig build) runs the given fuzz function once through. However, normally the smith’s .in is not conducive to a run-through that exercises much of the code (in some basic cases I’m playing with). So, I’m forced to a bit of contrivance. If I could just sample-feed, via constructInput(), as is done in some basic concept testing in std.testing.Smith, I could get a single run that is more representative of “normal” (actually exercising more of the code), then I could be more confident when firing off the fuzzer.
Sounds like you’re describing what FuzzInputOptions.corpus is meant for: ideally, it’s a set of minimized inputs that exercise a diverse set of paths in the code, to be used as a starting point for input generation.
Note: I believe the corpus option is from before the fuzzer was changed to be smith-based. It may be in a weird state at the moment since the inputs are no longer direct inputs.