Unsigned Sizes: A Five Year Mistake

(c3-lang.org)

16 points | by lerno 53 minutes ago

7 comments

EdSchouten 1 minute ago
> If sizes are unsigned, like in C, C++, Rust and Zig – then it follows that anything involving indexing into data will need to either be all unsigned or require casts.
I don’t really get this claim. Indexing should just look up the element corresponding to the value provided. It’s easy to come up with semantics that are intuitive and sound, even if signed integers or ones smaller than size_t are used.
Groxx 12 minutes ago
>If sizes are unsigned, like in C, C++, Rust and Zig – then it follows that anything involving indexing into data will need to either be all unsigned or require casts. With C’s loose semantics, the problem is largely swept under the rug, but for Rust it meant that you’d regularly need to cast back and forth when dealing with sizes.
TBH I've had very little struggle with this at all. As long as you keep your values and types separate, the unsigned type that you got a number from originally feeds just fine into the unsigned type that you send it to next. Needing casting then becomes a very clear sign that you're mixing sources and there be dragons, back up and fix the types or stop using the wrong variable. It's a low-cost early bug detector.
Implicitly casting between integer types though... yeah, that's an absolute freaking nightmare
kevin_thibedeau 31 minutes ago
Systems programmers love to hate on unsigned integers. Generations have been infected with the Java world model that integers have to be pretend number lines centered on zero. Guess what, you still have boundary conditions to deal with. There are times when you really really need to use the full word range without negative values. This happens more often with low level programming and machines with small word sizes, something fewer people are engaged in. It doesn't need to be the default. Ada has them sequestered as modular types but it's available to use when needed.
[-]
- uecker 24 minutes ago
  Having them available is not the issue, using them for sizes and indices is what causes a lot of tricky bugs.
- einpoklum 3 minutes ago
  > There are times when you really really need to use the full word range without negative values.
  There are a few of those, but that is the niche case. Certainly when we're talking about 64-bit size types. And if you want to cater to smaller size types, then just just template over the size type. Or, OK, some other trick if it's C rather than C++.
LegionMammal978 18 minutes ago
> But what about the range? While it’s true that you get twice the range, surprisingly often the code in the range above signed-int max is quite bug-ridden. Any code doing something like (2U * index) / 2U in this range will have quite the surprise coming.
Alas, (2S * signed_index) / 2S will similarly result in surprises the moment the signed_index hits half the signed-int max. There's no free lunch when trying to cheat the integer ranges.
ks2048 18 minutes ago
I know language designers have a lot of trade-offs to consider... But I would say if you know a value will logically always be >= 0, better to have a type that reflects that.
The potential bugs listed would be prevented by, e.g. "x--" won't compile without explicitly supplying a case for x==0 OR by using some more verbose methods like "decrement_with_wrap".
The trade-off is lack of C-like concise code, but more safe and explicit.
ximm 13 minutes ago
Is the text on this page really #bbbdc3 on #ffffff? How is anyone supposed to be able to read that?
jonstewart 7 minutes ago
I hate using languages that only have signed integers. Using integers that can’t be negative fits many problems nicely and avoids the edge case of having to check for negative.