(Not intended as criticism, just drilling stonily into a couple details:)

Properly, 0 to UCHAR_MAX for the normal return range (which is usually 255 because CHAR_BIT usually ==8, but you occasionally get 16 in embedded and there was a 9-bit beastie with a 36-bit word amongst the earliest layer of C-targeted ISAs—wanna say Honeywell Something, maybe?), and hypothetically EOF needn’t be −1, it just usually is and traditionally was. Per the C standards it must be negative if the environment is hosted or pre-C99, so INT_MIN would also be reasonable.

int variables are always signed, or they wouldn’t be int; unsigned[int] is its own type unto itself, which would make the variable not-int. Now, once upon a time you could kinda split the diff

typedef int foo;
unsigned foo bar;

but this isn’t a thing in C per se any more, although some compilers still understand it with protest.

EOF < 0 isn’t necessarily the reason int is used—if char is signed, it can certainly store −1—but properly it’s that int is wider than char. E.g., were it spec-legal, you could easily use EOF == INT_MAX and never encounter any problems, except dinging printf’s maximum output length.

And in any event,

unsigned k;
while((k=getc(stdin)) != EOF && k != '\n')
    putc(k, stdout);
putchar('\n');

will work just as well as it would with int k.

It might raise a linty sort of warning for the signed-unsigned comparison (switch to != 0U+EOF if so), but (int)(unsigned)EOF == EOF, so it’s fine even if you’re worried about the promotion. (EOF will be silently converted to match an unsigned operand; −1 → UINT_MAX, assuming two’s-complement, which is a given from C23 on.)

I also personally think (k=getc(stdin)) <= UCHAR_MAX is kinda defensively swell, or even keen, as an alternative to þe olde != EOF; all erroneous and disallowed values are excluded that way.

Cast the value to unsigned char, or assign to a separate unsigned char variable and pass that, or use the C99 length modifier hh (%02hhX), or mask with 255U and cast to int, or mask with 255U and pass through an int variable.

The elements don’t all have to be the same type across space (e.g., you can have ints next to floats), but what matters is that any non-bytelike type you access it as remain consistent during the entirety of the arena memory’s lifetime. This is because the compiler expects alias-compatibility; just as you can’t pun an int variable directly to a float, you can’t access an allocated region directly as an int, then a float, even with a call to your release routine, without having escaped from C per se in the process.

malloc and free work because the language standards make them Special, and there’s no satisfactory way to write them in pure C to where freeing something actually wipes type information (if there is any). Any free that doesn’t go by the exact, extern-linkage internal identifier free is not Special.

There are ways around this, of course. If LTO isn’t a thing, then you (like generations before you) can rely on the optimizer’s inability to see through TU boundaries, and ensure that your allocate and release routines live in their own TU(s) to guard against alias-smashing. If LTO might be a thing, you can either disable it as a one-off (e.g., __attribute__((__optimize__("-fno-lto"))) or #pragma $ETC optimize $ETC) or strip LTO data out between compile and link.

If those aren’t options, you can memcpy everything in and out, or if you’re sure it’ll be C99 and no C++ you can set up a union of all pertinent types and access through that.

Arrays and functions pass indirectly, even when typedef’d. This is why setjmp works as a function accepting jmp_buf—the latter is an array type, typically.

They do different things. You can qualify any field access, and it’s how you disambiguate when base classes use overlapping names. So e.g., given

struct A {int x;} a;
struct B {int x;};
struct C : A, B {} c;

a.x is equivalent to a.A::x, and because c.x might refer to either A’s or B’s field, you must use either c.A::x or c.B::x to select one or the other.

So :: is a namespace qualifier, and . is the action of referencing a member of an actual object, and ditto for ->.

:: also comes into play with member pointers. E.g., int A::* is the type of a pointer to an int field relative to an A base type (basically a wrapper for ptrdiff_t, although member function pointers can be considerably more complicated).

Languages like Java that don’t quad-dot tend not to have C/++’s type syntax and typename weirdness, and that enables them to syntactically disambiguate type and package names from field, variable, and method names where they overlap. Java also lacks multiple inheritance of C++’s sort, so there’s less reason to support a distinct separator. If you had to do x.A.y.B.z to disambiguate you’d have no idea whether A and B are classes or constant fields maybe, and then overlap between class and field names could cause chaos.

Back in C++, you can often get around the need for :: by using using, which is a bad idea globally but perfectly fine inside a function body. Alternatively, there are countless ways to avoid repeating :: name prefixes, even if you have to use a macro or function.

Java references, Python variables/fields, and C pointers are basically the same thing, and a combination of java.lang.reflect.Method references and virtual methods give you function pointers in Java. (Python more-or-less first-classes functions, so they work like anything else.) Of course, Java and Python give you pointers to structs/arrays of a sort, but not pointers into them; you may need to pair object-references with indices or field names(/Java Fields) to achieve the same granularity as C.

Alternatively, you can usually simulate pointers as array indices. In a shell script sans array, you can implement pointers by concatenating integers onto variable names and poking/peeking thattaway, or (shudder) using filenames and softlinks as the pointers they are.

If all you have is one big integer, you can treat it as an array by divving/multiplying to shift elements, modding to isolate them, and adding to fill in an all-zero element.

So there’s always a way to do pointery things, if the language is remotely useful.

The question has been answered, but oh wow, that pointer comparison is fully Bad.

You shouldn’t compare pointers with <, <=, >, or >=, or subtract them, unless you’re quite sure they’re aimed at the same underlying object (no actual requirement for that here); otherwise it’s undefined behavior, and the implementation is within its rights to replace a comparison with a constant, even if that means p > q && p < q, or it contradicts p == q. Direct comparisons == and != are always safe, however.

In practice, the most “‘“portable”’” way to compare pointers is either through memcmp, which is, semantically speaking, definitely not what you want, but sometimes useful for sorting; or through a cast to uintptr_t, if it’s available, which it needn’t be.

Comparison between uintptr_ts is well-defined in all cases, but unfortunately both pointer representation in situ and conversion between pointer and integer formats are implementation-defined. While most ABIs will just hand the pointer’s bits off directly, there’s no language-level promise that char *p converting to uintptr_t k necessarily implies (uintptr_t)(p+1) == k+1, and therefore integer comparisons are only actually meaningful when everything happens to be flat-mapped, and there are no weird holes in the address format like you get with segment-spanning “huge” models.

E.g., if you’re slumming it on an AS/400 (god forbid), full-fledged pointers are 128-bit by default, but there are no 128-bit integer types to use for comparison :(, and part of a pointer’s representation is effectively a segment (object? IBM’s IP is 50% glossary, so IDR their term) ID. z/Arch, MCS-86, iAPX286, and IA-32 may also use __far pointer types (which may or may not be the default) that include a segment field, although modern IA-32 code mostly doesn’t use segmentation, and it’s quasi-vestigial under x64.

Anyway, on one of these beasties you can compare just the offsets, and that’ll be fine as long as the segments match. For flat-mode IA-32, this is the case—CS, SS, and DS are all aimed at the same virtual address range, although CS technically needs its own segment separate from SS/DS, and certain no-execute kludges may reduce CS’s limit. But the bases all line up. (FS and GS are still used separately, however, primarily for TLS and to speed up system calls.)

But if your input pointers’ segments don’t match, you effectively have no idea what relation the two pointers have to each other unless you can perform the address translation yourself, and you often can’t. It’s quite possible that only the OS knows where things are, and it won’t tell you, nyaah. However, if you can assume that segments you have C-wise access to don’t overlap, and that C objects are restricted to a single segment (not a sound assumption on DOS or OS/2, which will gladly allocate contiguous runs of 64-KiB segments for you), for memmove purposes you can assume the pointer ranges in different segments don’t overlap, and perform your copy at toppest speed.

Another issue that can arise is when dealing with function pointers, because they’re slippery. (As in, like an eel, not a slipper. You might semireasonably consider memmoveing function contents during loading, for example, when you’re shuffling code around and may even need to self-relocate.) Function pointers will coerce to void * and back without protest, but in both directions the conversion is implementation-specific, and therefore the “back” pointer needn’t match the original. Codeybytes might not be visible at all, or might require special instructions to access; but fortunately in C per se, functions are fully abstract, and have no data to copy from, so a generic memmove doesn’t need to care.

Only if you need to load seg regs ;)

KDevelop and Eclipse CDT are other ones. KDevelop is barer-bones; CDT’s clunky and its parser tends to be a bit behind-the-times (e.g., its preprocessor really should treat -1ULL > 0 as true, but doesn’t, but you can detect it with defined __CDT_PARSER__—CLion, JetBrains, and Intellisense also have their own parser macros, BTW, and older Intellisense would let you use stopfiles to detect it—so your code can actually react to the editor, to some extent) but it has a tits macro-thingy (mmmmostly works, and lets you trace each step of expansion), and it’s otherwise as capable as a normal IDE. IIRC both have debuggers built in.

Tecccccccχᵪχᵪχᵪχcccchnically Linux leaves it up to the filesystem driver—e.g., V/-FAT is not case-sensitive by default, but ext2/3/4(/5?/6? do we have a 6 yet?) and most others are. Often case-handling is configured at mount time, so it’s mostly up to Mr. Root (ﷺ) in practice.

Fun fact: DOS, Windows, WinNT, and various older UNIXes also have a rather terrifying situation regarding filename (and sometimes pathname) truncation.

Ideally, attempting to access an overlong file- or pathname should raise an error (e.g., ENAMETOOLONG), but various OSes will silently lop off anything beyond the limits and sally glibly forth as if nothing were wrong. DOS, DOSsy Windows, and AFAIK older NT truncate filenames; DOS also truncates extensions, so myspoonistoobig.com-capture.htm might become myspooni.com, which is distinctly unsettling.

Modern NT doesn’t truncate filenames at least, and IIRC modern POSIX requires the _POSIX_NO_TRUNC option (indicating an API-level promise to return an error if an overlong input is fed in), but older systems may require you to check functionality for individual paths with f-/pathconf, or might just not tell you at all whether truncation will occur (iow, FAFO and one-offery are the only detection methods).

However, everything must be twice as complicated as it ought to be when you’re Microsoft, and therefore NT pathnames support resource fork names or WETF MS calls them (Apple called them that on HFS IIRC, at least), and those do still truncate silently.

Seeing as to how most stuff just uses files and directories or container formats when it wants forkyness, I assume fucking nothing outside MS’s own software, malware, and MS’s own malware uses this feature. —I mean, I know the forkjobbies are used regardless, but not named in any explicit fashion. In any event, as long as an attacker doesn’t control pathnames too directly it shouldn’t matter. Just another small hole left open, and the terse “Caution: Holes (Intentional)” sign at the entrance to the park will surely suffice to keep tourists from sinking their ankle in and faceplanting.

So you run Linux on a separate computer and mount your Windows stuff on it?

Also, lol@

it came pre-built with windows and I don't want to try and go against its design philosophy

Just LOL

I thought you could pop it out on some BSDs, similar to a signal-return. Could be off.

VLAs are perfectly legal C, just in poor taste.

Treat it like a new box of Legos or something; when you come upon an unfamiliar system, dump all the pieces out onto the floor and start fucking around until you get a feel for how things work or somebody kicks you out of the museum.

Truly original is the easy part, because concept-spaces tend to be very large and dense. Correct and useful is the hard part, and our modern math and science derive from problems and urges that wouldn’t arise for an AI, or cause it to “select” itself onto a useful track.

C threading shouldn’t be used until C17 specifically (201710), and Pthreads or Windows threads are preferable if offered because they give you actual control over stuff (C threads are …threadbare) and can interact cleanly with other OS structures and actual hardware, which C threads make effectively no promises about. C threads are merely the baseline for functionality that must be offered by a modern, multiprocessor-capable C implementation.

You can detect Pthreads by detecting POSIX.1, which requires you to have #included <unistd.h> with _POSIX_C_SOURCE predefined to ≥199506L or so (just use 200809L). Then _POSIX_VERSION+0 must be >=198808L (which doesn’t get you 1003.1c-or-whicheveritis that introduced threading, but it doesn’t matter), and then specifically defined _POSIX_THREADS && _POSIX_THREADS+0 >= 0 means the API is at least supported, fully >0 means it’s definitely implemented, not stubbed in. !defined or <0 means definitely not supported.

It’s not uncommon for a subset of the <pthread.h> APIs to be offered even when POSIX isn’t fully supported, especially on embedded and esoteric systems, although the presence of the header (e.g., as indicated by __has_include or __has_include__, the latter offered by GCC 5–9 in all modes) might not signify actual support—often you’ll need specific compiler settings or libraries to be added in conjunction with the header at build time, and if you aren’t reasonably sure the build-user wants you to use threading, you should fall back to something else.

For Windows threads you detect WinAPI (IIRC all of Win32S, WinCE, Win32, or Win64 runtimes) with (gesture) all that.

Another good option is OpenMP (_OPENMP+0 >= YYYYMML), with which you can #pragma omp parallel for or task your way onto worker threads, call into extern functions therefrom, and move on with life. Newer OMP (post-4.0 IIRC) may also give you GPU offloading.

You can approximate multithreading on a single thread by scheduling asynchronous callbacks which perform one nonblocking or time-bounded operation, schedule any future actions, and return. (This is a stackless implementation of m-on-n pthread_yield.) If you can avoid blocking I/O, do so; e.g., postpone things connected to character device, socket, FIFO, or other special file and do your I/O only when necessary to avoid starving the blocking operation, or use an OS-specific IOCP/select/poll/mplex call to probe files, or if the OS offers nonblocking calls you can just deal with the spinnies, or you can set a timer signal that interrupts the I/O operation. ucontext and potentially signal handling goop can be used to switch stacks↔fibers.

BSD systems offer a vfork call that can be used to roughly, violently approximate threads, and most UNIXes will let you fork and share memory, or issue the same thread- or LWP-creation system calls you would via Pthreads.

Note that it’s permitted for strcpy to check its args, and it may even be possible with enough info carried through from source to executable. If the compiler inlines it, it is being checked to some extent.

It would just be unrealistically expensive for it to check args with any specificity, in most situations, and since the checks would only be worthwhile to compensate for ~required checks being left out of the codebase, it’s only when you’re emulating hard (e.g., Valgrind) or have built with the appropriate sanitizers that you’re likely to see any response beyond glitch, crash, or subversion as a result of bogus strcpy args.

String handling routines are also one of the individually higher-performance parts of the standard library (although they don’t combine cleanly enough for them to be generally appropriate in high-perf routines), and a bunch of extra kernel-runtime-filesystem-debuginfo-ABI interactions would be needed to do any strict checking well enough for secure situations.

Copies a fixed number of bytes from one object to another, never between overlapping ranges—if they might overlap, memmove instead.

Generally you either have or need the string and buffer lengths separate from the content, so you keep separate size_t variables or fields with them, and track to avoid overflow and ensure safety as you go. strlen for when you genuinely don’t know the length of a string (rare—primarily at OS/library boundaries), sizeof for when it’s right there. E.g., to copy from a string constant to a string:

#define setstr_const(STR, LIT)\
    memcpy(STR, LIT, sizeof("" LIT ""))

char buffer[32] = "????????";
setstr_const(buffer, "Oh, hello, there!");
memcpy(buffer+4, "Othello"+2, 4);
puts(buffer);

It’s only marginally safer than strcpy, though—not relying on the string contents as authority on their own length is certainly a useful property of a safe-ish string-handling function, but it has the same restrictions on null and overlapping arguments as strcpy, which is to say, it makes no promises if you give it a null argument, even for a zero-sized copy.

(This differs from the behavior of the archetypal implementation:

void *memcpy(void *restrict dst, const void *restrict src, size_t nbytes) {
    void *const ret = dst; char *p = dst; const char *q = src;
    while(nbytes--) *p++ = *q++;
    return ret;
}

Note: Give that a different name if you paste it, b/c the optimizer may optimize it as a jump to memcpy—i.e., an infinite loop. It specifically won’t break if null,null,0 is passed, but real memcpy can.)

memcpy is easier to inline and potentially much more performant than strcpy once you’ve got a length. strcpy can act on blocks of characters at a time if your CPU is sufficiently ℱancy, but block i+1 can’t safely be copied until it’s determined that blocks 0 through i don’t contain NUL, and that will occur only once the memory for all of those blocks makes it into the CPU or its streaming-engine stand-in, and that latency is half of the bottleneck for a large copy operation. (The other half is the fact that bringing a new cache line in must generally wait for an old line to be evicted, and if you’re using cache for a large copy you’ll come back cold.)

If you use a smallish constant length, the compiler can usually convert a memcpy call to load/store/move instructions, which makes it a legal way to circumvent alias analysis or alignment restrictions:

float x = 3.1415926F; uint_least32_t y;
_Static_assert(sizeof(x) == sizeof(y), "size mismatch");
memcpy(&y, &x, sizeof y);
printf("%f <=> 0x%08X\n", x, (unsigned)y);

memcpy supplies the length extrinsically, which unlike strcpy’s intrinsic length determination permits removal of an if(!*x) branch from the copy loop and makes it possible to copy in any order whatsoever, or even lets the copy complete after the function returns (e.g., on a secondary thread) as long as that’s not visible at the application level.

memmove on overlapping ranges limits it to only forwards or backwards, which is why it’s a separate function. On nonoverlapping ranges (you can’t detect overlap portably from C per se, don’t try), memmove is just a slllightly deferred thunk to memcpy—overlap checks will take a few extra instructions.

Generally, any full-fledged project will need a string library of some sort to be constructed or fetched; as for malloc, you generally wrap up memset and memcpy in higher-order operations, so the best API to use is really your own, as appropriate for context.

“The language is always wrong” would be more appropriate, given the number of Total Design Oopsies over the years.

JS and Java and C# and even Python are compiled, just later than C. (C is recompiled to some extent by the CPU, as is any machine code produced by the interpreter engine.) It’s all the same blasted thing placeshifted or timeshifted by a few centimeters.

An omni-union augmented by ISA and ABI one-off detection is as close as you reasonably get without soft-coding to some extent. Many C11-supportive compilers do define a macro in non-C11 modes, however—e.g., GCC, Clang, and IntelC generally define __BIGGEST_ALIGNMENT__, and C11 just uses that. I think that’s been around since at least GCC 4.4ish. But there aren’t really that many promises for that—it doesn’t change even if the ISA adds new instructions with higher alignments, because old stuff would break, so it’s really only good for basic malloc usage.

I don’t know of anything with a >16-byte scalar CPU data type, and scalar, non-_BitInt types are all that’re strictly required for malloc’s return value, so hardcoding 16 is uncouth but unlikely to cause problems in practice since practical allocators require various nonportable assumptions for their basic functioning.

For other stuff you may have to detect instruction subset (e.g., SSE or AVX vectors may need 16+-byte align, depending) and overalign at allocation, or use CPUID to get the cache line size, which is what’s best for interthread work. (But the specific optimal line size might depend on the exact hardware threads involved, because those determine the appropriate level of sharing.) Page size is required for interprocess work.

So rounding the omni-union size up to the next power of two gets you a baseline alignment (that much can be done at compile time via enum bank), and you might need to bump that power of two up by a couple notches at run time.

Alternatively, just take an alignment as a -D-defined macro, and default to the omni-union if it’s not defined. Each new version of the C language adds some auto-detectable stuff that prior versions either made detection very complicated for, required explicit config for, or required testing for with a dummy build.