484 points by jedeusus 257 days ago | 161 comments

nopurpose 257 days ago [-]

Every perf guide recommends minimizing allocations to reduce GC times, but if you look at pprof of a Go app, GC mark phase is what takes time, not GC sweep. GC mark always starts with known live roots (goroutine stacks, globals, etc.) and traverses references from there, colouring every pointer. To minimize GC time it is best to avoid _long-living_ allocations. Short-lived allocations, those which the GC mark phase will never reach, have an almost negligible effect on GC times.

Allocations of any kind have an effect on triggering GC earlier, but in real apps it is almost hopeless to avoid GC, except for very carefully written programs with no dependencies, and if GC happens, then reducing GC mark times gives a bigger bang for the buck.

liquidgecka 257 days ago [-]

It's worth calling out that abstractions can kill you in unexpected ways with Go.

Anytime you use an interface it forces a heap allocation, even if the object is only used read only and within the same scope. That includes calls to things like fmt.Printf() so doing a for loop that prints the value of i forces the integer backing i to be heap allocated, along with every other value that you printed. So if you helpfully make every api in your library use an interface you are forcing the callers to use heap allocations for every single operation.

slashdev 257 days ago [-]

I thought surely an integer could be inlined into the interface, I thought Go used to do that. But I tried it on the playground, and it heap allocates it:

https://go.dev/play/p/zHfnQfJ9OGc

masklinn 257 days ago [-]

Go did use to do that, it was removed years ago, in 1.4: https://go.dev/doc/go1.4#runtime

kbolino 257 days ago [-]

Basically, anything that isn't a thin pointer (*T, chan, map) gets boxed nowadays. The end result is that both words of an interface value are always pointers [1], which is very friendly to the garbage collector (setting aside the extra allocations when escape analysis fails). I've seen some tricks in the standard library to avoid boxing, e.g. how strings and times are handled by log/slog [2].

[1]: https://github.com/teh-cmc/go-internals/blob/master/chapter2...

[2]: https://cs.opensource.google/go/go/+/refs/tags/go1.24.1:src/...

ncruces 256 days ago [-]

slog.Value looks incredibly useful.

Just imagine a day where database/sql doesn't generate a tonne of garbage because it moves to use something like that?

ominous_prime 257 days ago [-]

go1.15 re-added small integer packing into interfaces: https://go.dev/doc/go1.15#runtime

masklinn 257 days ago [-]

It didn't, actually. Instead go 1.15 has a static array of the first 256 positive integers, and when it needs to box one for an interface it gets a pointer into that array instead: https://go-review.googlesource.com/c/go/+/216401/4/src/runti...

This array is also used for single-byte strings (which previously had its own array): https://go-review.googlesource.com/c/go/+/221979/3/src/runti...
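
The behaviour described here can be checked with a small measurement. The following is a minimal sketch, assuming a Go 1.15 or later toolchain; `sink` and `boxAllocs` are names made up for this illustration. It counts heap allocations when an `int` is converted to an interface value, once for a value inside the static 0..255 range and once for a value outside it.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so the boxed value always escapes.
var sink any

// boxAllocs reports the average number of heap allocations needed to
// convert v to an interface value. noinline keeps v a runtime value
// rather than a compile-time constant (constants are boxed statically).
//
//go:noinline
func boxAllocs(v int) float64 {
	return testing.AllocsPerRun(1000, func() { sink = v })
}

func main() {
	fmt.Println("boxing 7:   ", boxAllocs(7))    // inside the static array range
	fmt.Println("boxing 1000:", boxAllocs(1000)) // outside it
}
```

If the runtime works as described in the linked change, the first call should report no allocations and the second one allocation per conversion: the small value is still stored behind a pointer, it just points into the shared array instead of fresh heap memory.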

ominous_prime 257 days ago [-]

It didn't, do what? I would consider the first 256 integers to be "small integers" ;)

> Converting a small integer value into an interface value no longer causes allocation

I forgot that it can also be used for single-byte strings. That's not an optimization I ever encountered being useful, but it's there!

masklinn 257 days ago [-]

> It didn't, do what?

Reintroduce “packing into interfaces”.

It did a completely different thing. Small integers remain not inlined.

MarkMarine 257 days ago [-]

Are you including in this analysis the amount of time/resources it takes to allocate? GC isn't the only thing you want to minimize for when you're making a high performance system.

nopurpose 257 days ago [-]

From that perspective it boils down to "do less", which is what any perf guide already includes; allocations are no different from anything else an app does.

My comment is more about the "reduce allocations to reduce GC pressure" advice seen everywhere. It doesn't tell the whole story. Short-lived allocations don't introduce any GC pressure: you'll be hard pressed to see the GC sweep phase on pprof without zooming. People take this advice, spend time and energy hunting down allocations, just to see that total GC time remained the same after all that effort, because they were focusing on the wrong type of allocations.

MarkMarine 257 days ago [-]

Yeah, I understand what you’re saying, but my point is that you’re on the opposite side of the same coin: skipping full perf analysis and saying this one method works (yours is to reduce GC mark time, ignoring allocation; others try to reduce allocation time, ignoring GC time, or use any of the other methods listed in this doc).

zmj 257 days ago [-]

Pretty similar story in .NET. Make sure your inner loops are allocation-free, then ensure allocations are short-lived, then clean up the long tail of large allocations.

neonsunset 257 days ago [-]

.NET is far more tolerant to high allocation traffic since its GC is generational and overall more sophisticated (even if at the cost of tail latency, although that is workload-dependent).

Doing huge allocations which go to LOH is quite punishing, but even substantial inter-generational traffic won't kill it.

aktau 257 days ago [-]

Side note: see https://tip.golang.org/doc/gc-guide for more on how the Go GC works and what triggers it.

GC frequency is directly driven by allocation rate (in terms of bytes) and live heap size. Some examples:

  - If you halve the allocation rate, you halve the GC frequency.
  - If you double the live heap size, you halve the GC frequency (barring changes away from the default `GOGC=100`).
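
The two bullets above follow from a simple steady-state model, sketched below. This is a deliberate simplification of what the GC guide describes: it ignores GC roots, pacer behaviour and `GOMEMLIMIT`, and `gcCyclesPerSecond` is a made-up helper, not a runtime API. The assumption is that a new cycle starts once the program has allocated `liveHeap * GOGC / 100` bytes on top of the live heap.

```go
package main

import "fmt"

// gcCyclesPerSecond is a simplified model of GC frequency: the headroom
// between two cycles is liveHeap*GOGC/100, and the allocation rate
// determines how quickly that headroom is used up.
func gcCyclesPerSecond(allocBytesPerSec, liveHeapBytes float64, gogc int) float64 {
	headroom := liveHeapBytes * float64(gogc) / 100
	return allocBytesPerSec / headroom
}

func main() {
	const MiB = 1 << 20
	fmt.Println(gcCyclesPerSecond(100*MiB, 50*MiB, 100))  // baseline
	fmt.Println(gcCyclesPerSecond(50*MiB, 50*MiB, 100))   // half the allocation rate
	fmt.Println(gcCyclesPerSecond(100*MiB, 100*MiB, 100)) // double the live heap
}
```

With these made-up numbers the model gives 2 cycles per second for the baseline and 1 cycle per second for each of the other two cases.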

> ...but if you look at pprof of a Go app, GC mark phase is what takes time, not GC sweep.

It is true that sweeping is a lot cheaper than marking, which makes your next statement:

> Short-lived allocations, those which the GC mark phase will never reach, have an almost negligible effect on GC times.

...technically correct. Usually, this is the best kind of correct, but it omits two important considerations:

  - If you generate a ton of short-lived allocations instead of keeping them around, the GC will trigger more frequently.
  - If you reduce the live heap size (by not keeping anything around), the GC will trigger more frequently.

So now you have cheaper GC cycles, but many more of them. On top of that, you have vastly increased allocation costs.

It is not a priori clear to me this is a win. In my experience, it isn't.

deepsun 256 days ago [-]

Interesting, thank you. But I think those points are not correlated that much. For example if I create unnecessary wrappers in a loop, I might double the allocation rate, but I will not halve the live heap size, because I did not have those wrappers outside the loop before.

Basically, I'm trying to come up with a real-world example of a style change (like creating wrappers for every error, or using naked integers instead of time.Time) to estimate its impact. And my feeling is that any such example would affect one of your points way more than the other, so we can still argue that e.g. "creating short-lived iterators is totally fine".

nopurpose 257 days ago [-]

I enjoyed your detailed response, it adds value to this discussion, but I feel you missed the point of my comment.

I am against blanket statements like "reduce allocations to reduce GC pressure", which lead people the wrong way: they compare libraries based on "allocs/op" from go bench, they trust ridiculous (who allocates 8KB per iteration in a tight loop??) microbenchmarks of sync.Pool like in the article above, hoping to resolve their GC problem. They spend a considerable amount of effort just to find that they barely moved the needle on GC times.

If we generalize then my "avoid long-lived allocations" or yours "reduce allocation rate in terms of bytes" are much more useful in practice, than what this and many other articles preach.

kgeist 256 days ago [-]

The runtime also forces GC every 2 minutes. So yeah, a lot of long living allocations can stress the GC, even if you don't allocate often. That's why Discord moved from Go to Rust for their Read States server.

zbobet2012 257 days ago [-]

Eh, kind of. If you are allocating in a hot loop it's going to suck regardless. Object pools are really key if you want high perf because the general purpose allocator is way less efficient in comparison.

ncruces 257 days ago [-]

The point is not to avoid GC entirely, but to reduce allocation pressure.

If you can avoid allocs in a hot loop, it definitely pays to do so. If you can't for some reason, and can use sync.Pool there, measure it.

Cutting allocs in half may not matter much, but if you can cut them by 99% because you were allocating in every iteration of a 1 million loop, and now aren't, it will make a difference, even if all those allocs die instantly.

I've gotten better than two fold performance increases on real code with both techniques.
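
To illustrate the "allocating in every iteration" case, here is a minimal sketch that compares formatting numbers into a fresh string on each iteration with formatting into one reused buffer. The function names and the workload are invented for this example; real code should be measured with its own benchmarks.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// Package-level sinks keep the results alive so the work is not optimized away.
var (
	last  string
	total int
)

// perIteration formats each number into a fresh string.
func perIteration(n int) {
	for i := 0; i < n; i++ {
		last = strconv.Itoa(1_000_000 + i)
	}
}

// reused formats each number into one buffer that the caller allocated up front.
func reused(buf []byte, n int) {
	for i := 0; i < n; i++ {
		buf = strconv.AppendInt(buf[:0], int64(1_000_000+i), 10)
		total += len(buf)
	}
}

// measure reports the average number of heap allocations per call of f.
func measure(f func()) float64 { return testing.AllocsPerRun(100, f) }

func main() {
	buf := make([]byte, 0, 32)
	fmt.Println("per-iteration:", measure(func() { perIteration(1000) }))
	fmt.Println("reused buffer:", measure(func() { reused(buf, 1000) }))
}
```

The first variant allocates on every iteration, while the second should not allocate inside the loop at all, because the buffer's capacity is never exceeded.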

bboreham 257 days ago [-]

Agree that mark phase is the expensive bit. Disagree that it’s not worth reducing short-lived allocations. I spend a lot of time analyzing Go program performance, and reducing bytes allocated per second is always beneficial.

felixge 257 days ago [-]

+1. In particular []byte slice allocations are often a significant driver of GC pace while also being relatively easy to optimize (e.g. via sync.Pool reuse).
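
A minimal sketch of that kind of reuse, using `bytes.Buffer` (which wraps a `[]byte`) rather than a raw slice; `bufPool` and `greet` are names invented for this example:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable *bytes.Buffer values. Pooling pointers avoids
// an extra allocation when the value is converted to `any` on Put.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// greet builds a string using a pooled buffer instead of a fresh one.
func greet(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold a previous caller's data
	defer bufPool.Put(buf)

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies the bytes, so returning it is safe
}

func main() {
	fmt.Println(greet("gopher"))
	fmt.Println(greet("world"))
}
```

Whether this pays off depends on buffer sizes and call frequency, so it is worth confirming with a benchmark before adopting it.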

raggi 257 days ago [-]

You might wanna look at a system profiler too, pprof doesn't show everything.

Capricorn2481 257 days ago [-]

Aren't allocations themselves pretty expensive regardless of GC?

nu11ptr 257 days ago [-]

Go allocations aren't that bad. A few years ago I benchmarked them at about 4x as expensive as a bump allocation. That is slow enough to make an arena beneficial in high allocation situations, but fast enough to not make it worth it most of the time.

aktau 257 days ago [-]

Comparing with a fairly optimized malloc at $COMPANY, the Go allocator is (both in terms of relative cycles and fraction of cycles of all Go programs) significantly more expensive than the C/C++ counterpart (3-4x IIRC). For one, it has to do more work, like setting up GC metadata, and zeroing.

There have recently been some optimizations to `runtime.mallocgc`, which may have decreased that 3-4x estimate a bit.

nu11ptr 257 days ago [-]

How can that be true? If it is 3-4x more expensive than malloc, then per my measurements your malloc is a bump allocator, and that simply isn't true for any real world malloc implementation (typically a modified free list allocator afaik). `mallocgc` may not be fast, but I simply did not find it as slow as you are saying. My guess is it is about as fast as most decent malloc functions, but I have not measured, and it would be interesting to see a comparison (tough to do as you'd need to call malloc via CGo or write one in C and one in Go and trust the looping is roughly the same cost).

aktau 256 days ago [-]

I should correct and clarify: I meant 3-4x more expensive in relative terms. Meaning:

  - For C++ programs, the allocator (allocating+freeing) consumes roughly 5% of cycles.
  - For Go programs, the allocator (runtime.mallocgc) used to consume ~20% of cycles (this is the data I referenced). I checked and recently it's become closer to 15%, thanks to optimizations.

I have not tested the performance differential on a per-byte level (though that will also differ with object structure in Go).

epcoa 257 days ago [-]

No. If you have a moving multi generational GC, allocation is literally just an increment for short lived objects.

burch45 257 days ago [-]

This is about go not Java. Go makes different tradeoffs and does not have moving multigenerational GC.

pebal 257 days ago [-]

If you have a moving, generational GC, then all the benefits of fast allocation are lost due to data moving and costly memory barriers.

gf000 257 days ago [-]

Not at all. Most objects die young and thus are never moved. Also, the time before it is moved is very long compared to CPU operations so it is only statistically relevant (very good throughput, rare, longer tail on latency graphs).

Also, write-only barriers don't have that big of an overhead.

pebal 257 days ago [-]

It doesn't matter if objects die young — the other objects on the heap are still moved around periodically, which reduces performance. When you're using a moving GC, you also have additional read barriers that non-moving GCs don't require.

gf000 256 days ago [-]

Is that period really that big of a concern when your threads in any language might be context switched away by the OS? It's not a common occurrence on a CPU-timeline at all.

Also, it's no accident that every high-performance GC runtime went the moving, generational way.

pebal 256 days ago [-]

That time may seem negligible, since the OS can context switch threads anyway, but it’s still additional time during which your code isn’t doing its actual work.

Generations are used almost exclusively in moving GCs — precisely to reduce the negative performance impact of data relocation. Non-moving GCs are less invasive, which is why they don’t need generations and can be fully concurrent.

gf000 256 days ago [-]

I would rather say that generations are a further improvement upon a moving collector, improving space usage and decreasing the length of the "mark" phase.

And which GC is fully concurrent? I don't think that's possible (though I will preface that I am no expert, only read into the topic on a hobby level) - I believe the most concurrent GC out there is ZGC, which does read barriers and some pointer tricks to make the stop-the-world time independent of the heap size.

pebal 256 days ago [-]

Java currently has no fully concurrent GC, and due to the volume of garbage it manages and the fact that it moves objects, a truly fully concurrent GC for this language is unlikely to ever exist.

Non-moving GCs, however, can be fully concurrent — as demonstrated by the SGCL project for C++.

In my opinion, the GC for Go is the most likely to become fully concurrent in the future.

gf000 256 days ago [-]

Is SGCL your project?

In that case, are you doing atomic writes for managed pointers/the read flag on them? I have read a few of your comments on reddit and your flags seem to be per memory page? Still, the synchronization on them may or may not have a more serious performance impact than alternative methods and without a good way to compare it to something like Java which is the state of the art in GC research we can't really comment much on whether it's a net benefit.

Also, have you perhaps tried modeling your design in something like TLA+?

pebal 256 days ago [-]

Yes, SGCL is my project.

You can't write concurrent code without atomic operations — you need them to ensure memory consistency, and concurrent GCs for Java also rely on them. However, atomic loads and stores are cheap, especially on x86. What’s expensive are atomic counters and CAS operations — and SGCL uses those only occasionally.

Java’s GCs do use state-of-the-art technology, but it's technology specifically optimized for moving collectors. SGCL is optimized for non-moving GC, and some operations can be implemented in ways that are simply not applicable to Java’s approach.

I’ve never tried modeling SGCL's algorithms in TLA+.

pgwhalen 256 days ago [-]

It’s uncharitable to say the benefits are lost - I’d reframe it as creating tradeoffs.

deepsun 256 days ago [-]

Interesting, and I think that is not specific to Go, other mark-and-sweep GCs (Java, C#) should behave the same.

Which means that creating short lived objects (like iterators for loops, or some wrappers) is ok.

ted_dunning 256 days ago [-]

Not entirely. Go still doesn't have a generational collector so high allocation rates cause more GC's that must examine long-lived objects.

As such, short-lived objects have little impact in Java (thank god for that!). They will have second order effects in Go.

int_19h 256 days ago [-]

It should be noted that in C#, at least, the standard pattern is to use value types for enumerators, precisely so as to avoid heap allocations. This is the case for all (non-obsolete) collections in the .NET stdlib - e.g. List<T>.Enumerator is a struct.

nurettin 257 days ago [-]

Is it worth making short lived allocations just to please the GC? You might just end up with too many allocations which will slow things down even more.

aktau 257 days ago [-]

It is not. Please see my answer (https://news.ycombinator.com/item?id=43545500).

stouset 257 days ago [-]

Checking out the first example—object pools—I was initially blown away that this is not only possible but it produces no warnings of any kind:

    pool := sync.Pool{
        New: func() any { return 42 }
    }

    a := pool.Get()

    pool.Put("hello")
    pool.Put(struct{}{})

    b := pool.Get()
    c := pool.Get()
    d := pool.Get()

    fmt.Println(a, b, c, d)

Of course, the answer is that this API existed before generics so it just takes and returns `any` (née `interface{}`). It just feels as though golang might be strongly typed in principle, but in practice there are APIs left and right that escape out of the type system and lose all of the actual benefits of having it in the first place.

Is a type system all that helpful if you have to keep turning it off any time you want to do something even slightly interesting?

Also I can't help but notice that there's no API to reset values to some initialized default. Shouldn't there be some sort of (perhaps optional) `Clear` callback that resets values back to a sane default, rather than forcing every caller to remember to do so themselves?

ncruces 257 days ago [-]

This is still strong typing, even if it's not static typing.

It's static vs. dynamic and strong vs. weak.

https://stackoverflow.com/a/11889763

9rx 257 days ago [-]

It is strong, static, and structural. But structural typing is effectively compile-time duck typing, so it is understandable that some might confuse it with dynamic typing.

masklinn 256 days ago [-]

Ggp is not talking about structural typing, but about sync.Pool type erasing (it takes `any` values, and returns `any` values). So you can put (and will retrieve) random garbage from it.

9rx 256 days ago [-]

> Ggp is not talking about structural typing,

Okay... but the reply was to the parent who questioned if the typing is static. It is by way of structural typing. The compiler enforces that the ducks, so to speak, are compatible. But as an empty interface has no constraints, all types are compatible. Whatever some other comment was talking about is irrelevant.

> but about sync.Pool type erasing

The type isn't erased...?

    p := sync.Pool{New: func() interface{} { return 1 }}
    fmt.Printf("%T", p.Get()) // Prints: int

> So you can put (and will retrieve) random garbage from it.

And you think that makes it dynamically typed? It does not.

zaphodias 257 days ago [-]

While I think you're right (generics might be useful there), it's fairly easy to wrap the `sync` primitives such as `sync.Pool` and `sync.Map` into your specific use case.

Go is pretty strict about breaking changes, so they probably won't change the current implementations; maybe we'll see a v2 version, or maybe not. The more code you have, the more code you have to maintain, and given Go's backward-compatibility promises, that's a lot of work.

aktau 257 days ago [-]

Upstream thinks a type-safer `sync.Pool` is a good idea too. It's being discussed in https://go.dev/issue/71076.

Someone 257 days ago [-]

> While I think you're right (generics might be useful there), it's fairly easy to wrap the `sync` primitives such as `sync.Pool` and `sync.Map` into your specific use case.

That’s not a strong argument. You can easily (but sometimes tediously) wrap any API with one that (further) restricts what types you can use with it. Generics make it possible to avoid doing that work, and code you don’t write won’t have errors.

zaphodias 257 days ago [-]

Don't get me wrong, I agree! Especially performance-wise, I'd love to have the best primitives that let me build whatever I want and not some very generic primitives that perform a bit worse and I have to tune myself so I don't shoot myself in the foot.

strangelove026 257 days ago [-]

sync.Map is meant to have poor performance, I believe.

https://github.com/golang/go/issues/21031

PhilippGille 257 days ago [-]

It depends on the use case.

From the Godoc:

> The Map type is optimized for two common use cases: (1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys. In these two cases, use of a Map may significantly reduce lock contention compared to a Go map paired with a separate Mutex or RWMutex.

Source: https://pkg.go.dev/sync#Map

And regarding slow writes, those were recently improved in Go 1.24:

> The implementation of sync.Map has been changed, improving performance, particularly for map modifications. For instance, modifications of disjoint sets of keys are much less likely to contend on larger maps, and there is no longer any ramp-up time required to achieve low-contention loads from the map.

Source: https://go.dev/doc/go1.24#minor_library_changes ("sync" section)
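
As an illustration of the first use case from the Godoc quote (keys written once, read many times), here is a minimal sketch of a grow-only cache. `cache` and `upper` are hypothetical names, and the upper-casing stands in for any computation worth caching.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// cache is a grow-only cache: each key is written once and read many times.
var cache sync.Map

// upper returns the cached upper-case form of s, computing it on first use.
func upper(s string) string {
	if v, ok := cache.Load(s); ok {
		return v.(string)
	}
	// If several goroutines race here, LoadOrStore keeps the first stored value.
	v, _ := cache.LoadOrStore(s, strings.ToUpper(s))
	return v.(string)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = upper("gopher")
		}()
	}
	wg.Wait()
	fmt.Println(upper("gopher"))
}
```

For write-heavy workloads on shared keys, a plain map guarded by a `sync.Mutex` or `sync.RWMutex` may still be the better fit, as the documentation itself suggests.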

jlouis 257 days ago [-]

It is fairly common your type system ends up with escape hatches allowing you to violate the type rules in practice. See e.g., OCaml and the function "magic" in the Obj module.

It serves as a way around a limitation in the type system which you don't want to deal with.

You can still have the rest of the code base be safe, as long as you create a wrapper which is.

The same can be said about having imperative implementations with functional interfaces wrapping said implementation. From the outside, you have a view of a system which is functionally sound. Internally, it might break the rules and use imperative code (usually for the case of efficiency).

stouset 257 days ago [-]

Obviously every type system in practice has escape hatches. But I’ve never seen another statically typed language where you need to break out of the type system so regularly.

Go’s type system has your back when you’re writing easy stuff.

But it throws up its hands and leaves you to fend for yourself when you need to do nearly anything interesting or complex, which is precisely when I want the type system to have my back.

I should not have to worry (or worse, not worry and be caught off guard) that my pool of database connections suddenly starts handing back strings.

int_19h 256 days ago [-]

> But I’ve never seen another statically typed language where you need to break out of the type system so regularly.

It's about the same as Java and C# prior to their adoption of generics, and largely for the same reasons.

jfwwwfasdfs 257 days ago [-]

A lot of languages have top types

tgv 257 days ago [-]

You never programmed in Go, I assume? Then you have to understand that the type of `pool.Get()` is `any`, the wildcard type in Go. It is a type, and if you want the underlying value, you have to get it out by asserting the correct type. This cannot be solved with generics. There's no way in Java, Rust or C++ to express this either, unless it is a pool for a single type, in which case Go generics indeed could handle that as well. But since Go is backwards compatible, this particular construct has to stay.

> Also I can't help but notice that there's no API to reset values to some initialized default.

That's what the New function does, isn't it?

BTW, the code you posted isn't syntactically correct. It needs a comma on the second line.

gwd 257 days ago [-]

> That's what the New function does, isn't it?

But that's only run when the pool needs to allocate more space. What GP seems to expect is that sync.Pool() would always return a zeroed structure, just as Golang allocation does.

I think Golang's implementation does make sense, as sync.Pool() is clearly an optimization you use when performance is an issue; and in that case you almost certainly want to only initialize parts of the struct that are needed. But I can see why it would be surprising.

> [any] is a type

It's typed the way Python is typed, not the way Rust or C are typed; so loses the "if it compiles there's a good chance it's correct" property that people want from statically typed languages.

I don't use sync.Pool, but it does seem like now that we have generics, having a typed pool would be better.

9rx 257 days ago [-]

> so loses the "if it compiles there's a good chance it's correct" property that people want from statically typed languages.

If that's what people actually wanted, Coq and friends would be household names, not the obscure oddities that they are. All the languages that people actually use on any kind of regular basis require you to write tests in order to gain that sense of correctness, which also ends up validating type-correctness as a natural byproduct.

"A machine can help me refactor my code" is the property that most attracts people to the statically typed languages that are normally used. With "I can write posts about it on the internet" being the secondary property of interest.

gwd 257 days ago [-]

It's a spectrum, with costs and benefits at each level. I lock my front door even though I don't have bars on my windows; I prefer Golang, where doing a basic compile will catch a fair number of errors and testing will catch the rest, to Python or Perl where testing is the only way to catch errors.

9rx 257 days ago [-]

> where doing a basic compile will catch a fair number of errors

In the case of refactoring this is incredibly useful. It doesn't say much about the correctness of your program, though.

stouset 257 days ago [-]

> But that's only run when the pool needs to allocate more space. What GP seems to expect is that sync.Pool() would always return a zeroed structure, just as Golang allocation does.

Not quite that. Imagine I have a pool of buffers with a length and capacity, say when writing code to handle receiving data from the network.

When I put one of those buffers back, I would like the next user of that buffer to get it back emptied. The capacity should stay the same, but the length should be zero.

I think it’s reasonable to have a callback to do this. One, it doesn’t force every consumer of the pool to have to remember themselves; it’s now a guarantee of the system itself. Two, it’s not much work but it does prevent me from re-emptying freshly-allocated items (in this case reinitializing is fast, but in some cases it may not be).

This also should be an optional callback since there are many cases where you don’t want any form of object reset.
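
One way such an optional callback could look is sketched below. This is a hypothetical wrapper, not an existing or proposed standard-library API: `Pool`, `NewPool`, and `newBufferPool` are invented for illustration. The reset hook runs on `Put`, so freshly allocated values are never reset needlessly.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// Pool is a hypothetical typed wrapper around sync.Pool.
type Pool[T any] struct {
	inner sync.Pool
	reset func(T) // optional; called on Put before the value is pooled
}

// NewPool builds a Pool. newFn creates fresh values; reset may be nil.
func NewPool[T any](newFn func() T, reset func(T)) *Pool[T] {
	p := &Pool[T]{reset: reset}
	p.inner.New = func() any { return newFn() }
	return p
}

// Get returns a pooled or freshly created value with its static type intact.
func (p *Pool[T]) Get() T { return p.inner.Get().(T) }

// Put resets v (if a hook was given) and returns it to the pool.
func (p *Pool[T]) Put(v T) {
	if p.reset != nil {
		p.reset(v)
	}
	p.inner.Put(v)
}

// newBufferPool pools *bytes.Buffer values and empties them on Put.
func newBufferPool() *Pool[*bytes.Buffer] {
	return NewPool(
		func() *bytes.Buffer { return new(bytes.Buffer) },
		func(b *bytes.Buffer) { b.Reset() }, // keep capacity, drop contents
	)
}

func main() {
	buffers := newBufferPool()

	b := buffers.Get()
	b.WriteString("leftover data")
	buffers.Put(b)

	// Whether this is the same buffer or a new one, it comes back empty.
	fmt.Println(buffers.Get().Len())
}
```

The wrapper is only useful for pointer-like types: a plain `int` or a struct value is copied on every `Get` and `Put`, so a reset hook cannot reach the pooled copy and the pooling itself saves nothing.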

ignoramous 257 days ago [-]

> What GP seems to expect is that sync.Pool() would always return a zeroed structure, just as Golang allocation does.

One could define a new "Pool[T]" type (extending sync.Pool) to get these guarantees:

  type Pool[T any] sync.Pool  // typed def

  func (p *Pool[T]) Get() T { // typed Get
      pp := (*sync.Pool)(p)
      return pp.Get().(T)
  }

  func (p *Pool[T]) Put(v T) { // typed Put
      pp := (*sync.Pool)(p)
      pp.Put(v)
  }

  intpool := Pool[int]{        // alias New
      New: func() any { var zz int; return zz },
  }

  boolpool := Pool[bool]{      // alias New
      New: func() any { var zz bool; return zz },
  }

https://go.dev/play/p/-WG7E-CVXHR

9rx 257 days ago [-]

> One could define a new "Pool[T]" type (extending sync.Pool) to get these guarantees:

So long as that one is not you? You completely forgot to address the expectation:

    type Foo struct{ V int }
    pool := Pool[*Foo]{ // Your Pool type.
        New: func() any { return new(Foo) },
    }

    a := pool.Get()
    a.V = 10
    pool.Put(a)

    b := pool.Get()
    fmt.Println(b.V) // Prints: 10; 0 was expected.

ignoramous 257 days ago [-]

> You completely forgot to address the expectation

> fmt.Println(b.V) // Prints: 10; 0 was expected.

Sorry, I don't get what else one expects when pooling pointers to a type? In fact, pooling *[]uint8 or *[]byte is commonplace; Pool.Put() or Pool.Get() then must zero its contents.

9rx 257 days ago [-]

> I don't get what else one expects when pooling pointers to a type?

As seen in your previous comment, the expectation is that the zero value will always be returned: "What GP seems to expect is that sync.Pool() would always return a zeroed structure, just as Golang allocation does." To which you offered a guarantee.

> Pool.Put() or Pool.Get() then must zero its contents.

Right. That is the solution (as was also echoed in the top comment in this thread) if one needs that expectation to hold. But you completely forgot to do it, which questions what your code was for? It behaves exactly the same as sync.Pool itself... And, unfortunately, doesn't even get the generic constraints right, as demonstrated with the int and bool examples.

ignoramous 257 days ago [-]

> And, unfortunately, doesn't even get the generic constraints right, as demonstrated with the int and bool examples.

If those constraints don't hold (like you say) it should manifest as runtime panic, no?

> What GP seems to expect is that sync.Pool() would always return a zeroed structure

Ah, well. You gots to be careful when Pooling addresses.

> But you completely forgot to do it, which questions what your code was for?

OK. If anyone expects zero values for pointers, then the New func should return nil (but this is almost always useless), or if one expects values to be zeroed-out, then Pool.Get/Put must zero it out. Thanks for the code review.

9rx 257 days ago [-]

> If those constraints don't hold (like you say) it should manifest as runtime panic, no?

No. Your int and bool pools run just fine – I can't imagine you would have posted the code if it panicked – but are not correct.

> I did not forget?

Then your guarantee is bunk: "One could define a new "Pool[T]" type (extending sync.Pool) to get these guarantees:" Why are you making claims you won't stand behind?

ignoramous 257 days ago [-]

It was a blueprint. Embedding and typedefs are ways to implement these guarantees. And of course, writing a generic pool library is not what I was after.

> but are not correct.

I don't follow what you're saying. You asserted, "And, unfortunately, doesn't even get the generic constraints right, as demonstrated with the int and bool examples." What does it even mean? I guess, this bikeshed has been so thoroughly built that the discussion points aren't even getting through.

9rx 257 days ago [-]

> What does it even mean?

Values are copied in Go. Your code will function, but it won't work.

You've left it up to the user of the pool to not screw things up. Which is okay to some degree, but sync.Pool already does that alone, so what is your code for?

ignoramous 257 days ago [-]

> Values are copied in Go

Gotcha. Thanks for clearing it up.

> so what is your code for?

If that's not rhetorical, then the code was to demonstrate that sync.Pool could be "extended" with typedefs/embeds + custom logic. Whether it got pooling itself right was not the intended focus (as shown by the fact that it created int & bool pools).

9rx 257 days ago [-]

> then the code was to demonstrate that sync.Pool could be "extended" with other types and custom logic.

Wherein lies the aforementioned guarantee? The code guarantees neither the ask (zero values) nor even proper usage if you somehow didn't read what you quoted and thought that some kind of type safety was the guarantee being offered.

Furthermore, who, exactly, do you think would be familiar enough with Go to get all the other things right that you left out but be unaware of that standard, widely used feature?

ignoramous 257 days ago [-]

> Wherein lies the aforementioned guarantee?

I think you should re-read what I wrote. You seem to be upset that I did not solve everyone's problem with sync.Pool with my 10 liner (when I claimed no such thing).

  One could define a new "Pool[T]" type (extending sync.Pool) to get these guarantees

Meant... One could define / extend sync.Pool to get those guarantees [for their custom types] ... Followed by an example for int & bool types (which are copied around, so pooling is ineffective like you say, but my intention was to show how sync.Pool could be extended, and nothing much else).

9rx 257 days ago [-]

> I think you should re-read what I wrote.

You "forgot" to copy the colon from the original statement. A curious exclusion given the semantic meaning it carries. Were you hoping I didn't read your original comment and wouldn't notice?

> You seem to be upset

How could one ever possibly become upset on an internet forum? Even if for some bizarre and unlikely reason you were on the path to becoming upset, you'd turn off the computer long before ever becoming upset. There is absolutely no reason to use this tool if it isn't providing joy.

> One could define / extend sync.Pool to get those guarantees [for their custom types] ...

What audience would be interested in this? Is there anyone who understands all the intricacies of sync.Pool but doesn't know how to define types or how to write functions?

ignoramous 256 days ago [-]

> You "forgot" to copy the colon from the original statement.

You got me!

tgv 257 days ago [-]

> What GP seems to expect is that sync.Pool() would always return a zeroed structure

Might be, but that's a design decision that has nothing to do with type or generics, isn't it? You seem to refer to a function to drain the pool, which is not needed, and frankly, rather unusual.

> It's typed the way Python is typed

Not in the slightest.

> "if it compiles there's a good chance it's correct"

If you want to compare it to something, it's more like Rust's unwrap(), which will panic if you apply it to the wrong result.

gwd 257 days ago [-]

> Not in the slightest.

You know, it's this kind of comment on USENET forums which prompted the creation of StackOverflow. It's not curious and adds nothing to the discussion.

I like Go and use it extensively; and I like having the option to fall back to the `any` type. But it's simply a fact that using the `any` type means that certain properties of the program can't be checked at compile time, in the same way that Python isn't able to check certain properties of the program at compile time.

> If you want to compare it to something, it's more like Rust's unwrap(), which will panic if you apply it to the wrong result.

Rust's unwrap() is used when a type can have one of exactly two underlying types (which is why no type is specified). In this case, getting back an `any` type means the underlying type could literally be anything -- as demonstrated by the example, where they put an integer, a string, and an empty struct into the pool. That's almost certainly not what you wanted, but the compiler won't prevent you from doing it.
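
To make the point about `any` concrete, here is a small sketch of a bare sync.Pool accepting unrelated types, with the mismatch only surfacing at the type assertion, at run time. The helper name `describe` is made up for illustration and is not from the example being discussed.

```go
package main

import (
	"fmt"
	"sync"
)

// describe is a hypothetical helper that reports what came out of the pool.
// The comma-ok form avoids the panic a plain v.(*[]byte) would raise when
// the dynamic type is something else.
func describe(v any) string {
	if buf, ok := v.(*[]byte); ok {
		return fmt.Sprintf("buffer with cap %d", cap(*buf))
	}
	return fmt.Sprintf("unexpected %T", v)
}

func main() {
	var p sync.Pool // no type parameter: Put and Get traffic in `any`

	// All three calls compile, even though only the first is what we
	// meant to pool.
	buf := make([]byte, 0, 64)
	p.Put(&buf)
	p.Put("a string")
	p.Put(struct{}{})

	// Get makes no ordering promise (and may return nil), so every call
	// site has to be ready for any of these.
	for i := 0; i < 3; i++ {
		fmt.Println(describe(p.Get()))
	}
}
```

The compiler does stop you from using the result as a `*[]byte` without the assertion; what it cannot do is tell you the assertion will succeed.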

9rx 257 days ago [-]

> But it's simply a fact that using the `any` type means that certain properties of the program can't be checked at compile time

Yes, structural typing removes the ability to check certain properties at compile-time. That doesn't make it typed like Python, though.

int_19h 256 days ago [-]

"any" is not structural typing.

9rx 256 days ago [-]

any isn't a special type. It is an alias for interface{}.

The empty set is trivially satisfied by all types, but obviously can be narrowed as you see fit.

int_19h 252 days ago [-]

I know, but having a type that is a supertype of everything is still not structural typing. Structural typing naturally has such a type, yes, whereas with nominal type systems it has to be explicitly introduced - but that still happens often enough, e.g. object in Python or System.Object in C#.

OP's point was that `any` in Go is much akin to duck typing in Python because in both cases the type check happens at runtime at the moment you're about to use the object. Yes, in Go, technically there's a separate step of downcasting to some interface type before you invoke the members, and it's that downcasting which fails, but c'mon, this is a pedantic distinction without a meaningful difference in practical usage scenarios of `any` to work around language limitations. The relevant thing is that it has all the same downsides.

tgv 257 days ago [-]

Sorry, but comparing Python's total absence of typing to extracting a value from any is quite weird.

> certain properties of the program can't be checked at compile time

Neither can you check if a number is positive or negative, or if a string is empty or not at compile time, but that doesn't make Go similar to COBOL or Forth. `var v any` declares v to be of the type any, not of any arbitrary type, which is what Python does. Writing `v + 1` gives a compiler error, unlike Python, which may or may not turn it into a runtime error. It is rather different, and especially so when you look at interfacing. Even though you may declare a variable to be an integer in Python, there is no guarantee that it actually is, whereas in Go that is the case, which has significant implications for how you handle e.g. json.

> the compiler won't prevent you from doing it.

It will prevent you from using e.g. an array of strings as if it were an array of ints. Python does not. They are quite different.

> You know, it's this kind of comment on USENET forums which prompted the creation of StackOverflow. It's not curious and adds nothing to the discussion.

Ffs.

sophacles 256 days ago [-]

Python doesn't have "total absence of typing". It doesn't have static typing, so compile-time checks are not possible (well, historically; there are some pseudo-static typing things these days). The fact that you can call `+` on some objects but not others is literally the result of the objects being different types.

A truly typeless language (or maybe more accurately single type for everything language) is ASM, particularly for older CPU designs. You get a word - the bitfield in a register, and can do any operation on it. Is that 64 bits loaded from an array of characters and the programmer intended it to be a string? Cool you can bitwise and it with some other register. Was it a u64, a pointer, a pair of u32s? Same thing - the semantics don't change.

tgv 256 days ago [-]

The language itself is practically devoid of restrictions on data types. There only are standard lib functions that check the type of their arguments.

zaphodias 257 days ago [-]

I assume they're referring to the fact that a Pool can hold different types instead of being a collection of items of only one homogeneous type.

eptcyka 257 days ago [-]

Is there a time in your career where an object pool absolutely had to contain an unbounded set of types? Any time when you could not know at compile time the total set of types a pool should contain?

pyrale 257 days ago [-]

> There's no way in Java, Rust or C++ to express this either

You make it look like it's a good thing to be able to express it.

There's no way in Java, Rust or C++ to express this, praised be the language designers.

As for expressing a pool value that may be multiple things without a horrible any type and a horrible cast, you could make a union type in Rust, or an interface in Java implemented by multiple concrete objects. Both ways would force the consumer to explicitly check the value without requiring unchecked duck typing.

int_19h 256 days ago [-]

> There's no way in Java, Rust or C++ to express this, praised be the language designers.

That's not even the case. In Java, you'd just use Object, which is for all practical purposes equivalent to `interface{}` aka `any` in Go. And then you downcast. Indeed, code exactly like this was necessary in Java to work with collections before generics were added to the language.

In C++, there's no common supertype, but there is std::any, which can contain a value of any type and be downcast if you know what the actual type is.

sophacles 256 days ago [-]

Rust has an Any type. It's rarely useful, but there are occasionally situations where a heterogeneous collection is the right thing to do. Casting the any type back to actual type is fairly nice though, as the operation returns an Option<T> and you're forced to deal with the case where your cast is wrong.

tgv 256 days ago [-]

> You make it look like it's a good thing to be able to express it.

No, just that this is pre-generics Go, and backwards compatibility is taken seriously.

gf000 257 days ago [-]

How is it different than pre-generic Java?

Map/List<T> etc are erased to basically an array of Objects (or a more specific supertype) at compile-time, but you can still use the non-generic version (with a warning) if you want and put any object into a map/list, and get it out as any other type, you having to cast it as the correct type.

sapiogram 257 days ago [-]

> You never programmed in Go, I assume?

You might want to step off that extremely high horse for a second, buddy. It's extremely reasonable to expect a type-safe pool that only holds a single type, since that's the most common use case.

kevmo314 257 days ago [-]

Zero-copy is totally underrated. Like the site alludes to, Go's interfaces make it reasonably accessible to write zero-copy code but it still needs some careful crafting. The payoff is great though, I've often been surprised by how much time is spent allocating and shuffling memory around.

jasonthorsness 257 days ago [-]

I once built a proxy that translated protocol A to protocol B in Go. In many cases, protocol A and B were just wrappers around long UTF-8 or raw bytes content. For large messages, reading the content into a slice then writing that same slice into the outgoing socket (preceded and followed by slices containing the translated bits from A to B) made a significant improvement in performance vs. copying everything over into a new buffer.

Go's network interfaces and slices make this kind of thing particularly simple - I had to do the same thing in Java and it was a lot more awkward.
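
A minimal sketch of the pattern described above, assuming the translated framing and the untouched payload are already available as separate slices. The names `forward` and `render` and the sample framing bytes are invented for illustration. net.Buffers writes the slices in sequence without first concatenating them into a new buffer; on connections that support it the standard library can turn this into a vectored write.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
)

// forward writes header, payload and trailer to w as they are, without
// copying them into one contiguous buffer first.
func forward(w io.Writer, header, payload, trailer []byte) (int64, error) {
	bufs := net.Buffers{header, payload, trailer}
	return bufs.WriteTo(w)
}

// render is a stand-in for a socket so the example can run anywhere:
// it collects what forward wrote into a string.
func render(header, payload, trailer []byte) string {
	var out bytes.Buffer
	if _, err := forward(&out, header, payload, trailer); err != nil {
		return ""
	}
	return out.String()
}

func main() {
	payload := []byte("a long message body") // read once from protocol A
	fmt.Println(render([]byte("B-HEADER "), payload, []byte(" B-END")))
}
```

With a real net.Conn in place of the bytes.Buffer, the payload slice read from the incoming socket is the same memory handed to the outgoing one.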

roundup 257 days ago [-]

Additionally...

- https://go101.org/optimizations/101.html

- https://github.com/uber-go/guide

I wish this content existed as a model context protocol (MCP) tool to connect to my IDE along w/ local LLM.

After 6 months of switching between different language projects, it's challenging to remember all the important things.

jigneshdarji91 257 days ago [-]

Additionally... - https://www.uber.com/en-AU/blog/how-we-saved-70k-cores-acros...

This has saved Uber a lot of money on compute (I'm one of the devs). If your compute fleet is large and has memory to spare (stateless), performing dynamic GOGC tuning to tradeoff higher memory utilization for fewer GC events will save quite a lot of compute.
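
As a rough illustration of the knob involved (not of the dynamic tuner described in the linked post, which adjusts the value based on available memory), the same setting that the GOGC environment variable controls can be changed at run time with runtime/debug. The helper name `withGCPercent` and the value 400 are arbitrary choices for this sketch.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// withGCPercent runs fn with the given GOGC percentage and restores the
// previous setting afterwards. It returns the setting that was in effect
// before the change.
func withGCPercent(percent int, fn func()) (previous int) {
	previous = debug.SetGCPercent(percent)
	defer debug.SetGCPercent(previous)
	fn()
	return previous
}

func main() {
	old := withGCPercent(400, func() {
		// Allocation-heavy work goes here. A higher percentage lets the
		// heap grow further between collections: more memory, fewer cycles.
	})
	fmt.Println("GOGC before the change:", old)
}
```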

TechDebtDevin 257 days ago [-]

Embedding those docs in your MCP server takes about 5 seconds with mcp-go's AddResource method

https://github.com/mark3labs/mcp-go/blob/main/examples/every...

jrockway 257 days ago [-]

GOMEMLIMIT has saved me a number of times. In containerized production, it's nice, because sometimes jobs are ephemeral and don't even do enough allocations to hit the memory limit, so you don't spend any time in GC. But it's saved me the most times in CI where golangci-lint or govulncheck can't complete without running out of memory on a kind-of-large CI machine. Set GOMEMLIMIT and it eventually completes. (I switched to nogo, though, so at least golangci-lint isn't a problem anymore.)
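
For reference, a small sketch of the programmatic equivalent of GOMEMLIMIT, which needs Go 1.19 or later. The helper names and the 512 MiB figure are made up for illustration. Note that the "no time in GC" behaviour described above would presumably also need the proportional trigger switched off (GOGC=off); that pairing is an assumption about the setup, not something the comment states.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setSoftLimit sets the runtime's soft memory limit in bytes, the same
// setting the GOMEMLIMIT environment variable controls, and returns the
// previous limit.
func setSoftLimit(limit int64) int64 {
	return debug.SetMemoryLimit(limit)
}

// currentLimit reads the limit without changing it: a negative argument
// makes SetMemoryLimit a pure query.
func currentLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

func main() {
	setSoftLimit(512 << 20) // 512 MiB, an arbitrary figure
	fmt.Println("soft limit now:", currentLimit())
}
```

The limit is soft: the runtime collects more aggressively as the heap approaches it, but it does not refuse allocations.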

donatj 257 days ago [-]

Unpopular opinion maybe, but sync.Pool is so sharp, dangerous and leaky that I'd avoid using it unless it's your absolute last option. And even then, maybe consider a second server first.

infogulch 257 days ago [-]

A new sync/v2 NewPool() is being discussed that eliminates the sharp edges by making it generic: https://github.com/golang/go/issues/71076

I haven't personally found it to be problematic; just keep it private, give it a default new func, and be cautious about only putting things in it that you got out.
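
One way to follow that advice today is a thin typed wrapper, sketched below. This is not the API proposed in the linked issue; `Pool`, `NewPool` and the reset hook are hypothetical names for this example. It stores *T rather than T, so Get and Put hand around the same object instead of copies, which is the problem with pooling plain ints or bools raised earlier in the thread.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// Pool is a hypothetical typed wrapper around sync.Pool.
type Pool[T any] struct {
	inner sync.Pool
	reset func(*T)
}

// NewPool builds a pool whose New func allocates a zero T and whose reset
// func runs on every Put, so Get never hands out stale state.
func NewPool[T any](reset func(*T)) *Pool[T] {
	p := &Pool[T]{reset: reset}
	p.inner.New = func() any { return new(T) }
	return p
}

// Get's assertion cannot fail: only *T values ever enter inner.
func (p *Pool[T]) Get() *T { return p.inner.Get().(*T) }

func (p *Pool[T]) Put(v *T) {
	p.reset(v)
	p.inner.Put(v)
}

// bufPool stays unexported, as suggested above.
var bufPool = NewPool(func(b *bytes.Buffer) { b.Reset() })

func render(name string) string {
	buf := bufPool.Get()
	defer bufPool.Put(buf)
	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // copies the bytes out before the deferred Put
}

func main() {
	fmt.Println(render("gopher"))
}
```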

nasretdinov 256 days ago [-]

I think in general people understand that sync.Pool introduces essentially an equivalent of uninitialised memory (since objects aren't required to be cleaned up before returning them to the pool), and mostly use it for something like []byte, slicing it like buf[0:0] to avoid accidentally reading someone else's memory.

But the instrument itself is really sharp and is indeed kind of a last resort.

parhamn 257 days ago [-]

Noticed the object pooling doc, had me wondering: are there any plans to make packages like `sync` generic?

arccy 257 days ago [-]

eventually: https://github.com/golang/go/issues/71076

dennis-tra 257 days ago [-]

Can someone explain to me why the compiler can’t do struct-field-alignment? This feels like something that can easily be automated.

CamouflagedKiwi 257 days ago [-]

Because the order of fields can be significant. It's very relevant for syscalls, and is observable via the reflect package; it'd be strange if the field order was arbitrarily changed (and might change further between releases).

I assume the thinking was that this is pretty easy to optimise if you care, and if it's on by default there'd then have to be some opt-out which there isn't a good mechanism for.

9rx 257 days ago [-]

> and if it's on by default there'd then have to be some opt-out which there isn't a good mechanism for.

Good is subjective, but the mechanism is something already implemented: https://pkg.go.dev/structs#HostLayout

CamouflagedKiwi 256 days ago [-]

Oh interesting, I've not encountered that before - I suppose because it is currently the default behaviour.

What I was hoping not to find (and fortunately didn't!) was one of Go's magical syntactic comments.

kbolino 257 days ago [-]

In particular, struct field alignment matches C (even without cgo) and so any change to the default would break a lot of code.

9rx 256 days ago [-]

> struct field alignment matches C (even without cgo)

The spec defines alignment for numeric types, but that's about it. There is nothing in the spec about struct layout. That is implementation dependent. If you are relying on a particular implementation, you are decidedly in unsafe territory.

> so any change to the default would break a lot of code.

The compiler can disable optimization on cgo calls automatically and most other places where it matters are via the standard library, so it might not be as much as you think. And if you still have a case where it matters, that is what this is for: https://pkg.go.dev/structs#HostLayout

kbolino 256 days ago [-]

That's good to know. I'm not making use of this assumption, but the purego package (came from ebitengine) does. It looks like they're aware of HostLayout [1] but I'm not sure how many other people have gotten the memo (HostLayout didn't exist before Go 1.23).

[1]: https://github.com/ebitengine/purego/issues/259

arp242 256 days ago [-]

> There is nothing in the spec about struct layout

De-facto a lot of programs rely on it, so whatever the spec says is irrelevant.

Not just for cgo by the way, but also things like binary.Read()/Write().

9rx 256 days ago [-]

> but also things like binary.Read()/Write().

binary.Read/Write already uses reflect as far as I can tell, so it wouldn't matter in that case. There is an optimization for numeric values in there, which may be what you are thinking of? But they are specified in the spec and that's not what we're talking about anyway (and if an optimization really had to go, for whatever reason, it wouldn't be the end of the world).

Did you mean something else?

kbolino 254 days ago [-]

I had to think about this one a bit.

Basically, binary.Read/binary.Write are capable of reading/writing struct values. The worry would be that, if the Go compiler reordered fields under the hood, the order may differ between writing the data and reading it back (especially across different versions of Go).

However, because these functions use reflection, they likely wouldn't be affected. While the in-memory layout of the struct fields might get reordered, presumably the reflection order would match the declaration order. Indeed, there is already an Offset field on the reflect.StructField type, and there is no statement anywhere I can find that such offsets must increase monotonically.

So, the fields would remain in declaration order when inspected with reflection, but their offsets could jump around within the struct, yet well behaved reflection-based code should be agnostic to this change.
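
A quick way to look at the two orderings side by side is to print each field's reflection index next to its Offset. The `record` type and `fieldNames` helper below are invented for this sketch; with today's compiler the offsets do increase with the index, but the code does not rely on that.

```go
package main

import (
	"fmt"
	"reflect"
)

type record struct {
	Flag  bool
	Count int64
	Tag   uint16
}

// fieldNames returns the struct's field names in the order reflection
// reports them, which is declaration order.
func fieldNames(v any) []string {
	t := reflect.TypeOf(v)
	names := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		names = append(names, t.Field(i).Name)
	}
	return names
}

func main() {
	t := reflect.TypeOf(record{})
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		// Index is the declaration position; Offset is where the field
		// actually sits in memory.
		fmt.Printf("%-5s index=%d offset=%d\n", f.Name, i, f.Offset)
	}
}
```

Code that walks fields by index, as encoding/binary does, sees declaration order regardless of where the offsets point.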

int_19h 256 days ago [-]

This is very unfortunate, since most structs are never going to be passed to C, yet end up paying the tax anyway. They really should have made it opt-in.

masklinn 256 days ago [-]

It can. Rust does.

That requires a way to opt out tho, because there are situations where you need a specific field ordering, so now the language needs to provide a way to tune struct compilation behaviour.

9rx 257 days ago [-]

Like the answer to all "Why doesn't Go have X?" questions: Lack of manpower. There has been some work done to support it, but it is far from complete. Open source doesn't mean open willingness to contribute, unfortunately. Especially when you're not the cool kid on the block.

__turbobrew__ 256 days ago [-]

Calling mmap “zero copy” is generous. I guess we glaze over the whole page fault thing, or the fact that performance is heavily dependent on how much memory pressure the process is under.

This is the same n00b trap that derailed the llama.cpp project last year because people don’t understand how memory maps and paging work, and the tradeoffs.

neillyons 257 days ago [-]

Curious to know what people are building where you need to optimise like this? eg Struct Field Alignment https://goperf.dev/01-common-patterns/fields-alignment/#avoi...

dundarious 257 days ago [-]

False sharing is an absolutely classic Concurrency 101 lesson, nothing remarkable about it.
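
For readers who have not met it, a minimal sketch of the usual remedy: give each goroutine's counter its own cache line by padding. The 64-byte line size is an assumption (common on x86-64, not universal), and the type and function names are made up for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"unsafe"
)

// cacheLine is an assumed cache line size in bytes.
const cacheLine = 64

// paddedCounter occupies a full (assumed) cache line, so the n fields of
// neighbouring slice elements are 64 bytes apart and never share a line.
type paddedCounter struct {
	n int64
	_ [cacheLine - 8]byte
}

// countInParallel has each worker increment only its own counter, then
// sums the results.
func countInParallel(workers, perWorker int) int64 {
	counters := make([]paddedCounter, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(c *paddedCounter) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				atomic.AddInt64(&c.n, 1)
			}
		}(&counters[w])
	}
	wg.Wait()

	var total int64
	for i := range counters {
		total += atomic.LoadInt64(&counters[i].n)
	}
	return total
}

// counterSize reports the size of one padded counter.
func counterSize() uintptr { return unsafe.Sizeof(paddedCounter{}) }

func main() {
	fmt.Println(countInParallel(4, 1000), counterSize())
}
```

Whether the padding pays off depends on the hardware and the contention pattern, so it is worth benchmarking with and without it.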

kubb 257 days ago [-]

Something that shouldn’t be written in a GC language.

Cthulhu_ 257 days ago [-]

GC is not relevant in this case, it's about whether you can make structs fit in cache lines and CPU registers. Mechanical sympathy is the googleable phrase. GC is a few layers further away.

piokoch 257 days ago [-]

I don't think GC has anything to do here, doing manual memory allocation we might hit the same problem.

EdwardDiego 257 days ago [-]

Huh, this surprises me about Golang, didn't realise it was so similar to C with struct alignment. https://goperf.dev/01-common-patterns/fields-alignment/#why-...

Cthulhu_ 257 days ago [-]

Yup, it's a fairly low-level language intended as a replacement to C/C++ but for modern day systems (networked, concurrent, etc). You don't have manual memory management per se but you still need to decide on heap vs stack and consider the hardware.

jerf 257 days ago [-]

"you still need to decide on heap vs stack"

No, you can't decide on heap vs stack. Go's compiler decides that. You can get feedback about the decision if you pass the right debug flags, and then based on that you may be able to tickle the optimizer into changing its mind based on code changes you make, but it'll always be an optimization decision subject to change without notice in any future versions of Go, just like any other language where you program to the optimizer.
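
A small sketch of what asking for that feedback looks like. Building the file below with `go build -gcflags=-m` prints the compiler's escape-analysis decisions; the function names are made up, and the exact diagnostics vary between Go versions.

```go
package main

import "fmt"

type point struct{ x, y int }

// sumValue keeps p local to the call, so the compiler can normally leave
// it on the stack.
func sumValue() int {
	p := point{1, 2}
	return p.x + p.y
}

// newPoint returns the address of a local. p has to outlive the call, so
// it is normally reported as moved to the heap (unless inlining lets the
// compiler prove otherwise at a particular call site).
func newPoint() *point {
	p := point{3, 4}
	return &p
}

func main() {
	fmt.Println(sumValue(), newPoint().y)
}
```

Nothing in the source says "heap" or "stack"; the only lever is restructuring the code and re-checking the diagnostics.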

If you need that level of control, Go is generally not the right language. However, I would encourage developers to be sure they need that level of control before taking it, and that's not special pleading for Go but special pleading for the entire class of "languages that are pretty fast but don't offer quite that level of control". There's still a lot of programmers running around with very 200x ideas of performance, even programmers who weren't programmers at the time, who must have picked it up by osmosis.

(My favorite example to show 200x perf ideas is paginated APIs where the "pages" are generally chosen from the set {25, 50, 100} for "performance reasons". In 2025, those are terribly, terribly small numbers. Presenting that many results to humans makes sense, but my default size for paginating API calls nowadays is closer to 1000, and that's the bottom end, for relatively expensive things. If I have no reason to think it's expensive, tack another order of magnitude on to my minimum.)

fmstephe 256 days ago [-]

Just an anecdote from work to back this up. I wrote a system that was taking requests, making another request to a service (that basically wrapped elasticsearch) and then processed the results and returned the results to the caller.

By default the elastic-search results were paginated and defaulted to some small number in the order of 25..100. I increased this steadily upwards beyond 100,000 to the point where every request always returned the entire result in the first page. And it _transformed_ the performance of the service. From one that was unbearably slow for human users to one that _felt_ instantaneous. I had real perf numbers at the time, but now all I have are the impressions.

But the lesson on the impact of the overhead of those paginated calls was important. Obviously everything is specific and YMMV, but this is something worth having in the back of your mind.

jensneuse 257 days ago [-]

You can often fool yourself by using sync.Pool. pprof looks great because no allocs in benchmarks but memory usage goes through the roof. It's important to measure real world benefits, if any, and not just synthetic benchmarks.

makeworld 257 days ago [-]

Why would Pool increase memory usage?

jensneuse 257 days ago [-]

Let's say you have constantly 1k requests per second and for each request, you need one buffer, each 1 MiB. That means you have 1 GiB in the pool. Without a pool, there's a high likelihood that you're using less. Why? Because in reality, most requests need a 1 MiB buffer but SOME require a 5 MiB buffer. As such, your pool grows over time as you don't have control over the distribution of the size of the pool items.

So, if you have predictable object sizes, the pool will stay flat. If the workloads are random, you have a new problem because, like in this scenario, your pool grows 5x more.

You can solve this problem. E.g. you can only give back items into the pool that are small enough. Alternatively, you could have a small pool and a big pool, but now you're playing cat and mouse.

In such a scenario, it could also work to simply allocate and use GC to clean up. Then you don't have to worry about memory and the lifetime of objects, which makes your code much simpler to read and reason about.
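
The "only give back items that are small enough" option can be sketched as follows. The 1 MiB threshold, the 4 KiB starting capacity and the helper names are arbitrary choices for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// maxPooledCap is an arbitrary cut-off: buffers that grew beyond it are
// dropped instead of pooled.
const maxPooledCap = 1 << 20 // 1 MiB

var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 4096)
		return &b // pool a pointer so Put does not allocate a new box
	},
}

func getBuf() *[]byte { return bufPool.Get().(*[]byte) }

// putBuf returns true if the buffer went back into the pool and false if
// it was left for the GC.
func putBuf(b *[]byte) bool {
	if cap(*b) > maxPooledCap {
		return false // oversized backing array: let the GC reclaim it
	}
	*b = (*b)[:0] // keep the capacity, discard the contents
	bufPool.Put(b)
	return true
}

func main() {
	b := getBuf()
	*b = append(*b, "request body"...)
	fmt.Println("pooled:", putBuf(b))

	big := make([]byte, 0, 5<<20) // the occasional 5 MiB request
	fmt.Println("pooled:", putBuf(&big))
}
```

This keeps the rare large buffers from becoming the pool's steady state, at the cost of reallocating them when a large request does arrive.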

jerf 257 days ago [-]

Long before sync.Pool was a thing, I wrote a pool for []bytes: https://github.com/thejerf/gomempool I haven't taken it down because it isn't obsoleted by sync.Pool because the pool is aware of the size of the []bytes. Though it may be somewhat obsoleted by the fact the GC has gotten a lot better since I wrote it, somewhere in the 1.3 time frame. But it solved exactly the problem I had: relatively infrequent messages from the computer's point of view (e.g., a system that is probably getting messages every 50ms or so), but that had to be pulled into buffers completely to process, and had highly irregular sizes. The GC was doing a ton of work when I was allocating them all the time but it was easy to reuse buffers in my situation.

theThree 256 days ago [-]

>That means you have 1 GiB in the pool.

This only happens when every request lasts 1 second.

xyproto 257 days ago [-]

I guess if you allocate more than you need upfront that it could increase memory usage.

throwaway127482 257 days ago [-]

I don't get it. The pool uses weak pointers under the hood right? If you allocate too much up front, the stuff you don't need will get garbage collected. It's no worse than doing the same without a pool, right?

cplli 257 days ago [-]

What the top commenter probably failed to mention, and jensneuse tried to explain, is that sync.Pool assumes the memory cost of pooled items is similar. If you are pooling buffers (e.g. []byte) or any other type whose backing memory can grow beyond its initial capacity during use, you can end up in a scenario where backing arrays that have grown to MB capacities are handed out by the pool for jobs that need a few KB, while the KB buffers go to high-memory jobs, which in turn grow their backing arrays to MB and return them to the pool.

If that's the case, it's usually better to have non-global pools, pool ranges, drop things after a certain capacity, etc.:

https://github.com/golang/go/issues/23199 https://github.com/golang/go/blob/7e394a2/src/net/http/h2_bu...

nopurpose 257 days ago [-]

also no one GCs sync.Pool. After a spike in utilization, live with increased memory usage until program restart.

ncruces 257 days ago [-]

That's just not true. Pool contents are GCed after two cycles if unused.

nopurpose 257 days ago [-]

What do you mean? Pool content can't be GCed, because there are references to it: the pool itself.

What people do is what this article suggested, pool.Get/pool.Put, which makes it only grow in size even if the load profile changes. The app literally accumulates now-unwanted garbage in the pool, and no app I have seen made an attempt to GC it.

ahmedtd 256 days ago [-]

From the sync.Pool documentation:

> If the Pool holds the only reference when this happens, the item might be deallocated.

Conceptually, the pool is holding a weak pointer to the items inside it. The GC is free to clean them up if it wants to, when it gets triggered.

ashf023 257 days ago [-]

sync.Pool uses weak references for this purpose. The pool does delay GC, and if your pooled objects have pointers, those are real and can be a problem. If your app never decreases the pool size, you've probably reached a stable equilibrium with usage, or your usage fits a pattern that GC has trouble with. If Go truly cannot GC your pooled objects, you probably have a memory leak. E.g. if you have Nodes in a graph with pointers to each other in the pool, and some root pointer to anything in the pool, that's a memory leak

andrewf 256 days ago [-]

https://github.com/golang/go/blob/master/src/sync/pool.go#L2...

The GC calls out to sync.Pool's cleanup.

nikolayasdf123 257 days ago [-]

nicely organised. I feel like this could grow into community driven current state-of-the-art of optimisation tips for Go. just need to allow people edit/comment their input easily (preferably in-place). I see there is github repo, but my bet people would not actively add their input/suggestions/research there, it is hidden too far from the content/website itself

whalesalad 257 days ago [-]

For sure. Feels like the broader dev community could use a generic wiki platform like this, where every language or toolkit can have its own section. Not just for performance/optimization, but also for idiomatic ways to use a language in practice.

inadequatespace 256 days ago [-]

Why doesn’t the compiler pack structs for you if it’s as easy as shuffling around based on type?

greatgib 253 days ago [-]

Because the organization of your struct is exactly how the memory has to be organized and that might be important for you. The compiler doesn't know your intended usage so it can't rework the structure at its will.

For example, you might take the block of memory and data and send it to another system that will decode it. Or you can take the block of memory and store it in a file or in a hardware device where it means something in this specific order.

kunley 257 days ago [-]

"Although the struct Data contains a [1024]int array, which is 4 KB (assuming int is 4 bytes on the architecture used)"

Huh, what?

I mean, who uses 32b architecture by default?

bombela 257 days ago [-]

Most C/C++ compilers have 32b int on 64b arch. Maybe the confusion comes from that.

Also it would be 4KiB not 4KB.

kunley 251 days ago [-]

What?

Article is about the Go compiler. On 64 bit arch Go int is 64 bits.
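
This is easy to check on whatever machine you build for. The sketch below assumes the article's struct looks roughly like `Data` here; the field name is a guess.

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

// Data mirrors the struct quoted from the article; the field name is an
// assumption.
type Data struct {
	values [1024]int
}

// dataSize is the struct's size in bytes on the current target.
func dataSize() uintptr { return unsafe.Sizeof(Data{}) }

// intBits is the width of int in bits on the current target.
func intBits() int { return strconv.IntSize }

func main() {
	// On a 64-bit target this reports 64 bits and 8192 bytes (8 KiB);
	// on a 32-bit target, 32 bits and 4096 bytes.
	fmt.Printf("int is %d bits, Data is %d bytes\n", intBits(), dataSize())
}
```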

_345 257 days ago [-]

Anyone know of a resource like this but for Python 3?

asicsp 257 days ago [-]

This might help: https://pythonspeed.com/datascience/

nikolayasdf123 257 days ago [-]

nice article. good to see statements backed up by Benchmarks right there

ljm 257 days ago [-]

You're not really writing 'Go' anymore when you're optimising it, it's defeating the point of the language as a simple but powerful interface over networked services.

jrockway 257 days ago [-]

Why? You have control over the parts where control yields noticeable savings, and the rest just kind of works with reasonable defaults.

Taken to the extreme, Go is still nice even with constraints. For example, tinygo is pretty nice for microcontroller projects. You can say upfront that you don't want GC, and just allocate everything at the start of the program (kind of like how DJB writes C programs) and writing the rest of the program is still a pleasant experience.

ashf023 257 days ago [-]

100%. I work in Go and use optimizations like the ones in the article, but only in a small percentage of the code. Go has a nice balance where it's not pessimized by default, and you can just write 99% of code without thinking about these optimizations. But having this control in performance critical parts is huge. Some of this stuff is 10x, not +5%. Also, Go has very good built-in support for CPU and memory profiling which pairs perfectly with this.

emmelaich 257 days ago [-]

I think you have a point that there's generic advice for optimising: don't.

i.e. Make it simple, then measure, then make it fast if necessary.

Perhaps all this is understood for readers of the article.

Cthulhu_ 257 days ago [-]

What do you mean? If you don't want that level of control over e.g. memory allocation, registers, cache lines etc, there's higher level languages than Go you can pick from, e.g. Java / C# / JS.

mariusor 257 days ago [-]

I think at least some of the patterns shared in the document, using zero-copy, ordering struct properties are all very idiomatic. Writing code in this manner is writing good Go code.

kbolino 257 days ago [-]

[1]: https://github.com/teh-cmc/go-internals/blob/master/chapter2...

[2]: https://cs.opensource.google/go/go/+/refs/tags/go1.24.1:src/...

ncruces 256 days ago [-]

slog.Value looks incredibly useful.

Just imagine a day where database/sql doesn't generate a tonne of garbage because it moves to use something like that?

ominous_prime 257 days ago [-]

go1.15 re-added small integer packing into interfaces: https://go.dev/doc/go1.15#runtime

masklinn 257 days ago [-]

This array is also used for single-byte strings (which previously had its own array): https://go-review.googlesource.com/c/go/+/221979/3/src/runti...

ominous_prime 257 days ago [-]

It didn't, do what? I would consider the first 256 integers to be "small integers" ;)

> Converting a small integer value into an interface value no longer causes allocation

I forgot that it can also be used for single-byte strings. That's not an optimization I ever encountered being useful, but it's there!

masklinn 257 days ago [-]

> It didn't, do what?

Reintroduce “packing into interfaces”.

It did a completely different thing. Small integers remain not inlined.
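The distinction drawn above (small values avoid the allocation, but nothing is packed into the interface word) can be observed with `testing.AllocsPerRun`. This is a minimal sketch, not taken from the thread: `sink`, `small`, `large`, and `allocsPerConversion` are illustrative names, and the values are read through pointers to package-level variables on the assumption that this keeps the compiler from treating them as constants.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so that every converted value escapes to the heap.
var sink any

// Variables rather than constants: constant-to-interface conversions can be
// resolved at compile time and would hide the runtime behaviour.
var small, large = 100, 100_000

// allocsPerConversion reports the average number of heap allocations made
// by one int-to-interface conversion of *v.
func allocsPerConversion(v *int) float64 {
	return testing.AllocsPerRun(1000, func() { sink = *v })
}

func main() {
	fmt.Println("small value allocs:", allocsPerConversion(&small))
	fmt.Println("large value allocs:", allocsPerConversion(&large))
}
```

On Go 1.15 or later I would expect no allocation for the value below 256 and one allocation per conversion for the larger value, which matches the release note quoted above. The interface still holds a pointer in both cases.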

MarkMarine 257 days ago [-]

Are you including in this analysis the amount of time/resources it takes to allocate? GC isn't the only thing you want to minimize for when you're making a high performance system.

nopurpose 257 days ago [-]

From that perspective it boils down to "do less", which any perf guide already includes; allocations are no different from anything else an app does.

MarkMarine 257 days ago [-]

zmj 257 days ago [-]

Pretty similar story in .NET. Make sure your inner loops are allocation-free, then ensure allocations are short-lived, then clean up the long tail of large allocations.

neonsunset 257 days ago [-]

.NET is far more tolerant of high allocation traffic since its GC is generational and overall more sophisticated (even if at the cost of tail latency, although that is workload-dependent).

Doing huge allocations which go to LOH is quite punishing, but even substantial inter-generational traffic won't kill it.

aktau 257 days ago [-]

Side note: see https://tip.golang.org/doc/gc-guide for more on how the Go GC works and what triggers it.

GC frequency is directly driven by allocation rate (in terms of bytes) and live heap size. Some examples:

  - If you halve the allocation rate, you halve the GC frequency.
  - If you double the live heap size, you halve the GC frequency (barring changes away from the default `GOGC=100`).

> ...but if you look at pprof of a Go app, GC mark phase is what takes time, not GC sweep.

It is true that sweeping is a lot cheaper than marking, which makes your next statement:

> Short lived allocations, those which GC mark phase will never reach, has almost negligible effect on GC times.

...technically correct. Usually, this is the best kind of correct, but it omits two important considerations:

  - If you generate a ton of short-lived allocations instead of keeping them around, the GC will trigger more frequently.
  - If you reduce the live heap size (by not keeping anything around), the GC will trigger more frequently.

So now you have cheaper GC cycles, but many more of them. On top of that, you have vastly increased allocation costs.

It is not a priori clear to me this is a win. In my experience, it isn't.
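The two rules of thumb in the list above (halve the allocation rate or double the live heap, and GC frequency halves) follow from a simple steady-state model. The sketch below is an illustration under that simplification, not the runtime's actual pacer: `gcCyclesPerSecond` is a hypothetical helper, and it assumes a cycle starts once `liveHeap * GOGC / 100` new bytes have been allocated, ignoring GC roots.

```go
package main

import "fmt"

// gcCyclesPerSecond is a simplified model: with live heap L and GOGC=g,
// the program may allocate L*g/100 new bytes before the next cycle starts.
func gcCyclesPerSecond(allocBytesPerSec, liveHeapBytes float64, gogc int) float64 {
	headroom := liveHeapBytes * float64(gogc) / 100
	return allocBytesPerSec / headroom
}

func main() {
	const MiB = 1 << 20

	baseline := gcCyclesPerSecond(100*MiB, 50*MiB, 100)
	halfAlloc := gcCyclesPerSecond(50*MiB, 50*MiB, 100)
	doubleHeap := gcCyclesPerSecond(100*MiB, 100*MiB, 100)

	fmt.Printf("baseline:          %.1f cycles/s\n", baseline)
	fmt.Printf("half the allocs:   %.1f cycles/s\n", halfAlloc)
	fmt.Printf("double live heap:  %.1f cycles/s\n", doubleHeap)
}
```

In this model, shrinking the live heap while keeping the allocation rate constant raises the cycle count, which is the trade-off the comment above describes.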

deepsun 256 days ago [-]

nopurpose 257 days ago [-]

I enjoyed your detailed response, it adds value to this discussion, but I feel you missed the point of my comment.

If we generalize, then my "avoid long-lived allocations" or your "reduce allocation rate in terms of bytes" are much more useful in practice than what this and many other articles preach.

kgeist 256 days ago [-]

zbobet2012 257 days ago [-]

ncruces 257 days ago [-]

The point is not to avoid GC entirely, but to reduce allocation pressure.

If you can avoid allocs in a hot loop, it definitely pays to do so. If you can't for some reason, and can use sync.Pool there, measure it.

I've gotten better than twofold performance increases on real code with both techniques.

bboreham 257 days ago [-]

felixge 257 days ago [-]

+1. In particular []byte slice allocations are often a significant driver of GC pace while also being relatively easy to optimize (e.g. via sync.Pool reuse).

raggi 257 days ago [-]

You might wanna look at a system profiler too, pprof doesn't show everything.

Capricorn2481 257 days ago [-]

Aren't allocations themselves pretty expensive regardless of GC?

nu11ptr 257 days ago [-]

aktau 257 days ago [-]

There have recently been some optimizations to `runtime.mallocgc`, which may have decreased that 3-4x estimate a bit.

nu11ptr 257 days ago [-]

aktau 256 days ago [-]

I should correct and clarify: I meant 3-4x more expensive in relative terms. Meaning:

  - For C++ programs, the allocator (allocating+freeing) consumes roughly 5% of cycles.
  - For Go programs, the allocator (runtime.mallocgc) used to consume ~20% of cycles (this is the data I referenced). I checked and recently it's become closer to 15%, thanks to optimizations.

I have not tested the performance differential on a per-byte level (though that will also differ with object structure in Go).

epcoa 257 days ago [-]

No. If you have a moving multi generational GC, allocation is literally just an increment for short lived objects.

burch45 257 days ago [-]

This is about Go, not Java. Go makes different tradeoffs and does not have a moving multigenerational GC.

pebal 257 days ago [-]

If you have a moving, generational GC, then all the benefits of fast allocation are lost due to data moving and costly memory barriers.

gf000 257 days ago [-]

Also, write-only barriers don't have that big of an overhead.

pebal 257 days ago [-]

gf000 256 days ago [-]

Is that period really that big of a concern when your threads in any language might be context switched away by the OS? It's not a common occurrence on a CPU-timeline at all.

Also, it's no accident that every high-performance GC runtime went the moving, generational way.

pebal 256 days ago [-]

That time may seem negligible, since the OS can context switch threads anyway, but it’s still additional time during which your code isn’t doing its actual work.

gf000 256 days ago [-]

I would rather say that generations are a further improvement upon a moving collector, improving space usage and decreasing the length of the "mark" phase.

pebal 256 days ago [-]

Java currently has no fully concurrent GC, and due to the volume of garbage it manages and the fact that it moves objects, a truly fully concurrent GC for this language is unlikely to ever exist.

Non-moving GCs, however, can be fully concurrent — as demonstrated by the SGCL project for C++.

In my opinion, the GC for Go is the most likely to become fully concurrent in the future.

gf000 256 days ago [-]

Is SGCL your project?

Also, have you perhaps tried modeling your design in something like TLA+?

pebal 256 days ago [-]

Yes, SGCL is my project.

I’ve never tried modeling SGCL's algorithms in TLA+.

pgwhalen 256 days ago [-]

It’s uncharitable to say the benefits are lost - I’d reframe it as creating tradeoffs.

deepsun 256 days ago [-]

Interesting, and I think that is not specific to Go, other mark-and-sweep GCs (Java, C#) should behave the same.

Which means that creating short lived objects (like iterators for loops, or some wrappers) is ok.

ted_dunning 256 days ago [-]

Not entirely. Go still doesn't have a generational collector, so high allocation rates cause more GCs that must examine long-lived objects.

As such, short-lived objects have little impact in Java (thank god for that!). They will have second order effects in Go.

int_19h 256 days ago [-]

nurettin 257 days ago [-]

Is it worth making short lived allocations just to please the GC? You might just end up with too many allocations which will slow things down even more.

aktau 257 days ago [-]

It is not. Please see my answer (https://news.ycombinator.com/item?id=43545500).

stouset 257 days ago [-]

Checking out the first example—object pools—I was initially blown away that this is not only possible but it produces no warnings of any kind:

    pool := sync.Pool{
        New: func() any { return 42 }
    }

    a := pool.Get()

    pool.Put("hello")
    pool.Put(struct{}{})

    b := pool.Get()
    c := pool.Get()
    d := pool.Get()

    fmt.Println(a, b, c, d)

Is a type system all that helpful if you have to keep turning it off any time you want to do something even slightly interesting?

ncruces 257 days ago [-]

This is still strong typing, even if it's not static typing.

It's static vs. dynamic and strong vs. weak.

https://stackoverflow.com/a/11889763

9rx 257 days ago [-]

It is strong, static, and structural. But structural typing is effectively compile-time duck typing, so it is understandable that some might confuse it with dynamic typing.

masklinn 256 days ago [-]

Ggp is not talking about structural typing, but about sync.Pool type erasing (it takes `any` values, and returns `any` values). So you can put (and will retrieve) random garbage from it.

9rx 256 days ago [-]

> Ggp is not talking about structural typing,

> but about sync.Pool type erasing

The type isn't erased...?

    p := sync.Pool{New: func() interface{} { return 1 }}
    fmt.Printf("%T", p.Get()) // Prints: int

> So you can put (and will retrieve) random garbage from it.

And you think that makes it dynamically typed? It does not.

zaphodias 257 days ago [-]

While I think you're right (generics might be useful there), it's fairly easy to wrap the `sync` primitives such as `sync.Pool` and `sync.Map` into your specific use case.

aktau 257 days ago [-]

Upstream thinks a type-safer `sync.Pool` is a good idea too. It's being discussed in https://go.dev/issue/71076.

Someone 257 days ago [-]

> While I think you're right (generics might be useful there), it's fairly easy to wrap the `sync` primitives such as `sync.Pool` and `sync.Map` into your specific use case.

zaphodias 257 days ago [-]

strangelove026 257 days ago [-]

sync.Map is meant to have poor performance, I believe

https://github.com/golang/go/issues/21031

PhilippGille 257 days ago [-]

It depends on the use case.

From the Godoc:

Source: https://pkg.go.dev/sync#Map

And regarding slow writes, those were recently improved in Go 1.24:

Source: https://go.dev/doc/go1.24#minor_library_changes ("sync" section)

jlouis 257 days ago [-]

It is fairly common your type system ends up with escape hatches allowing you to violate the type rules in practice. See e.g., OCaml and the function "magic" in the Obj module.

It serves as a way around a limitation in the type system which you don't want to deal with.

You can still have the rest of the code base be safe, as long as you create a wrapper which is.

stouset 257 days ago [-]

Obviously every type system in practice has escape hatches. But I’ve never seen another statically typed language where you need to break out of the type system so regularly.

Go’s type system has your back when you’re writing easy stuff.

But it throws up its hands and leaves you to fend for yourself when you need to do nearly anything interesting or complex, which is precisely when I want the type system to have my back.

I should not have to worry (or worse, not worry and be caught off guard) that my pool of database connections suddenly starts handing back strings.

int_19h 256 days ago [-]

> But I’ve never seen another statically typed language where you need to break out of the type system so regularly.

It's about the same as Java and C# prior to their adoption of generics, and largely for the same reasons.

jfwwwfasdfs 257 days ago [-]

A lot of languages have top types

tgv 257 days ago [-]

> Also I can't help but notice that there's no API to reset values to some initialized default.

That's what the New function does, isn't it?

BTW, the code you posted isn't syntactically correct. It needs a comma on the second line.

gwd 257 days ago [-]

> That's what the New function does, isn't it?

But that's only run when the pool needs to allocate more space. What GP seems to expect is that sync.Pool() would always return a zeroed structure, just as Golang allocation does.

> [any] is a type

It's typed the way Python is typed, not the way Rust or C are typed; so loses the "if it compiles there's a good chance it's correct" property that people want from statically typed languages.

I don't use sync.Pool, but it does seem like now that we have generics, having a typed pool would be better.

9rx 257 days ago [-]

> so loses the "if it compiles there's a good chance it's correct" property that people want from statically typed languages.

gwd 257 days ago [-]

9rx 257 days ago [-]

> where doing a basic compile will catch a fair number of errors

In the case of refactoring this is incredibly useful. It doesn't say much about the correctness of your program, though.

stouset 257 days ago [-]

> But that's only run when the pool needs to allocate more space. What GP seems to expect is that sync.Pool() would always return a zeroed structure, just as Golang allocation does.

Not quite that. Imagine I have a pool of buffers with a length and capacity, say when writing code to handle receiving data from the network.

When I put one of those buffers back, I would like the next user of that buffer to get it back emptied. The capacity should stay the same, but the length should be zero.

This also should be an optional callback since there are many cases where you don’t want any form of object reset.
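The wish described above (a typed pool whose buffers come back emptied, with the reset step optional) can be sketched with generics on top of `sync.Pool`. Everything here is hypothetical: `Pool`, `NewPool`, and the `reset` callback are illustrative names and not part of the standard library or of the upstream proposal.

```go
package main

import (
	"fmt"
	"sync"
)

// Pool is a hypothetical typed wrapper around sync.Pool.
type Pool[T any] struct {
	inner sync.Pool
	reset func(T) // optional; nil means values are returned untouched
}

// NewPool builds a pool that creates values with newFn and, if reset is
// non-nil, calls it on every value handed back through Put.
func NewPool[T any](newFn func() T, reset func(T)) *Pool[T] {
	return &Pool[T]{
		inner: sync.Pool{New: func() any { return newFn() }},
		reset: reset,
	}
}

// Get returns a value of the pool's element type; the assertion cannot
// fail because Put only accepts T.
func (p *Pool[T]) Get() T { return p.inner.Get().(T) }

// Put resets the value (when a reset callback was given) and stores it.
func (p *Pool[T]) Put(v T) {
	if p.reset != nil {
		p.reset(v)
	}
	p.inner.Put(v)
}

func main() {
	bufs := NewPool(
		func() *[]byte { b := make([]byte, 0, 4096); return &b },
		func(b *[]byte) { *b = (*b)[:0] }, // keep capacity, drop length
	)

	b := bufs.Get()
	*b = append(*b, "hello"...)
	bufs.Put(b)

	c := bufs.Get()
	fmt.Println(len(*c), cap(*c))
}
```

Pooling a pointer (`*[]byte`) rather than the slice itself means the reset is visible to the next caller and the `Put` does not box a slice header on every call.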

ignoramous 257 days ago [-]

> What GP seems to expect is that sync.Pool() would always return a zeroed structure, just as Golang allocation does.

One could define a new "Pool[T]" type (extending sync.Pool) to get these guarantees:

  type Pool[T any] sync.Pool  // typed def

  func (p *Pool[T]) Get() T { // typed Get
      pp := (*sync.Pool)(p)
      return pp.Get().(T)
  }

  func (p *Pool[T]) Put(v T) { // typed Put
      pp := (*sync.Pool)(p)
      pp.Put(v)
  }

  intpool := Pool[int]{        // alias New
      New: func() any { var zz int; return zz },
  }

  boolpool := Pool[bool]{      // alias New
      New: func() any { var zz bool; return zz },
  }

https://go.dev/play/p/-WG7E-CVXHR

9rx 257 days ago [-]

> One could define a new "Pool[T]" type (extending sync.Pool) to get these guarantees:

So long as that one is not you? You completely forgot to address the expectation:

    type Foo struct{ V int }
    pool := Pool[*Foo]{ // Your Pool type.
        New: func() any { return new(Foo) },
    }

    a := pool.Get()
    a.V = 10
    pool.Put(a)

    b := pool.Get()
    fmt.Println(b.V) // Prints: 10; 0 was expected.

ignoramous 257 days ago [-]

> You completely forgot to address the expectation

> fmt.Println(b.V) // Prints: 10; 0 was expected.

Sorry, I don't get what else one expects when pooling pointers to a type? In fact, pooling *[]uint8 or *[]byte is commonplace; Pool.Put() or Pool.Get() then must zero its contents.

9rx 257 days ago [-]

> I don't get what else one expects when pooling pointers to a type?

> Pool.Put() or Pool.Get() then must zero its contents.

ignoramous 257 days ago [-]

> And, unfortunately, doesn't even get the generic constraints right, as demonstrated with the int and bool examples.

If those constraints don't hold (like you say) it should manifest as runtime panic, no?

> What GP seems to expect is that sync.Pool() would always return a zeroed structure

Ah, well. You gots to be careful when Pooling addresses.

> But you completely forgot to do it, which questions what your code was for?

9rx 257 days ago [-]

> If those constraints don't hold (like you say) it should manifest as runtime panic, no?

No. Your int and bool pools run just fine – I can't imagine you would have posted the code if it panicked – but are not correct.

> I did not forget?

Then your guarantee is bunk: "One could define a new "Pool[T]" type (extending sync.Pool) to get these guarantees:" Why are you making claims you won't stand behind?

ignoramous 257 days ago [-]

It was a blueprint. Embedding and typedefs are ways to implement these guarantees. And of course, writing a generic pool library is not what I was after.

> but are not correct.

9rx 257 days ago [-]

> What does it even mean?

Values are copied in Go. Your code will function, but it won't work.

You've left it up to the user of the pool to not screw things up. Which is okay to some degree, but sync.Pool already does that alone, so what is your code for?

ignoramous 257 days ago [-]

> Values are copied in Go

Gotcha. Thanks for clearing it up.

> so what is your code for?

9rx 257 days ago [-]

> then the code was to demonstrate that sync.Pool could be "extended" with other types and custom logic.

Furthermore, who, exactly, do you think would be familiar enough with Go to get all the other things right that you left out but be unaware of that standard, widely used feature?

ignoramous 257 days ago [-]

> Wherein lies the aforementioned guarantee?

I think you should re-read what I wrote. You seem to be upset that I did not solve everyone's problem with sync.Pool with my 10 liner (when I claimed no such thing).

  One could define a new "Pool[T]" type (extending sync.Pool) to get these guarantees

9rx 257 days ago [-]

> I think you should re-read what I wrote.

You "forgot" to copy the colon from the original statement. A curious exclusion given the semantic meaning it carries. Were you hoping I didn't read your original comment and wouldn't notice?

> You seem to be upset

> One could define / extend sync.Pool to get those guarantees [for their custom types] ...

What audience would be interested in this? Is there anyone who understands all the intricacies of sync.Pool but doesn't know how to define types or how to write functions?

ignoramous 256 days ago [-]

> You "forgot" to copy the colon from the original statement.

You got me!

tgv 257 days ago [-]

> What GP seems to expect is that sync.Pool() would always return a zeroed structure

Might be, but that's a design decision that has nothing to do with type or generics, isn't it? You seem to refer to a function to drain the pool, which is not needed, and frankly, rather unusual.

> It's typed the way Python is typed

Not in the slightest.

> "if it compiles there's a good chance it's correct"

If you want to compare it to something, it's more like Rust's unwrap(), which will panic if you apply it to the wrong result.

gwd 257 days ago [-]

> Not in the slightest.

You know, it's this kind of comment on USENET forums which prompted the creation of StackOverflow. It's not curious and adds nothing to the discussion.

> If you want to compare it to something, it's more like Rust's unwrap(), which will panic if you apply it to the wrong result.

9rx 257 days ago [-]

> But it's simply a fact that using the `any` type means that certain properties of the program can't be checked at compile time

Yes, structural typing removes the ability to check certain properties at compile-time. That doesn't make it typed like Python, though.

int_19h 256 days ago [-]

"any" is not structural typing.

9rx 256 days ago [-]

any isn't a special type. It is an alias for interface{}.

The empty set is trivially satisfied by all types, but obviously can be narrowed as you see fit.

int_19h 252 days ago [-]

tgv 257 days ago [-]

Sorry, but comparing Python's total absence of typing to extracting a value from any is quite weird.

> certain properties of the program can't be checked at compile time

> the compiler won't prevent you from doing it.

It will prevent you from using e.g. an array of strings as if it were an array of ints. Python does not. They are quite different.

> You know, it's this kind of comment on USENET forums which prompted the creation of StackOverflow. It's not curious and adds nothing to the discussion.

Ffs.

sophacles 256 days ago [-]

tgv 256 days ago [-]

The language itself is practically devoid of restrictions on data types. There only are standard lib functions that check the type of their arguments.

zaphodias 257 days ago [-]

I assume they're referring to the fact that a Pool can hold different types instead of being a collection of items of only one homogeneous type.

eptcyka 257 days ago [-]

Is there a time in your career where an object pool absolutely had to contain an unbounded set of types? Any time when you wouldn't know at compile time the total set of types a pool should contain?

pyrale 257 days ago [-]

> There's no way in Java, Rust or C++ to express this either

You make it look like it's a good thing to be able to express it.

There's no way in Java, Rust or C++ to express this, praised be the language designers.

int_19h 256 days ago [-]

> There's no way in Java, Rust or C++ to express this, praised be the language designers.

In C++, there's no common supertype, but there is std::any, which can contain a value of any type and be downcast if you know what the actual type is.

sophacles 256 days ago [-]

tgv 256 days ago [-]

> You make it look like it's a good thing to be able to express it.

No, just that this is pre-generics Go, and backwards compatibility is taken seriously.

gf000 257 days ago [-]

How is it different than pre-generic Java?

sapiogram 257 days ago [-]

> You never programmed in Go, I assume?

You might want to step off that extremely high horse for a second, buddy. It's extremely reasonable to expect a type-safe pool that only holds a single type, since that's the most common use case.

kevmo314 257 days ago [-]

jasonthorsness 257 days ago [-]

Go's network interfaces and slices make this kind of thing particularly simple - I had to do the same thing in Java and it was a lot more awkward.

roundup 257 days ago [-]

Additionally...

- https://go101.org/optimizations/101.html

- https://github.com/uber-go/guide

I wish this content existed as a model context protocol (MCP) tool to connect to my IDE along w/ local LLM.

After 6 months of switching between different language projects, it's challenging to remember all the important things.

jigneshdarji91 257 days ago [-]

Additionally... - https://www.uber.com/en-AU/blog/how-we-saved-70k-cores-acros...

TechDebtDevin 257 days ago [-]

Embedding those docs in your MCP server takes about 5 seconds with mcp-go's AddResource method

https://github.com/mark3labs/mcp-go/blob/main/examples/every...

jrockway 257 days ago [-]

donatj 257 days ago [-]

Unpopular opinion maybe, but sync.Pool is so sharp, dangerous and leaky that I'd avoid using it unless it's your absolute last option. And even then, maybe consider a second server first.

infogulch 257 days ago [-]

A new sync/v2 NewPool() is being discussed that eliminates the sharp edges by making it generic: https://github.com/golang/go/issues/71076

I haven't personally found it to be problematic; just keep it private, give it a default new func, and be cautious about only putting things in it that you got out.

nasretdinov 256 days ago [-]

But the instrument itself is really sharp and is indeed kind of last resort

parhamn 257 days ago [-]

Noticed the object pooling doc, had me wondering: are there any plans to make packages like `sync` generic?

arccy 257 days ago [-]

eventually: https://github.com/golang/go/issues/71076

dennis-tra 257 days ago [-]

Can someone explain to me why the compiler can’t do struct-field-alignment? This feels like something that can easily be automated.

CamouflagedKiwi 257 days ago [-]

I assume the thinking was that this is pretty easy to optimise if you care, and if it's on by default there'd then have to be some opt-out which there isn't a good mechanism for.
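For readers wondering what the manual optimisation looks like, here is a minimal sketch using `unsafe.Sizeof`. The `Loose` and `Tight` types and the `sizes` helper are made up for illustration; the exact byte counts depend on the platform's alignment rules.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Loose interleaves small and large fields, so the compiler has to insert
// padding before B and after C to keep B aligned.
type Loose struct {
	A bool
	B int64
	C bool
}

// Tight holds the same fields ordered from largest to smallest, which
// leaves padding only at the end.
type Tight struct {
	B int64
	A bool
	C bool
}

// sizes reports the in-memory size of both layouts.
func sizes() (loose, tight uintptr) {
	return unsafe.Sizeof(Loose{}), unsafe.Sizeof(Tight{})
}

func main() {
	loose, tight := sizes()
	fmt.Println("Loose:", loose, "bytes")
	fmt.Println("Tight:", tight, "bytes")
}
```

On a typical 64-bit platform this should print 24 and 16 bytes. Because Go keeps fields in declaration order, the saving only appears when the source order is changed by hand.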

9rx 257 days ago [-]

> and if it's on by default there'd then have to be some opt-out which there isn't a good mechanism for.

Good is subjective, but the mechanism is something already implemented: https://pkg.go.dev/structs#HostLayout

CamouflagedKiwi 256 days ago [-]

Oh interesting, I've not encountered that before - I suppose because it is currently the default behaviour.

What I was hoping not to find (and fortunately didn't!) was one of Go's magical syntactic comments.

kbolino 257 days ago [-]

In particular, struct field alignment matches C (even without cgo) and so any change to the default would break a lot of code.

9rx 256 days ago [-]

> struct field alignment matches C (even without cgo)

> so any change to the default would break a lot of code.

kbolino 256 days ago [-]

[1]: https://github.com/ebitengine/purego/issues/259

arp242 256 days ago [-]

> There is nothing in the spec about struct layout

De-facto a lot of programs rely on it, so whatever the spec says is irrelevant.

Not just for cgo by the way, but also things like binary.Read()/Write().

9rx 256 days ago [-]

> but also things like binary.Read()/Write().

Did you mean something else?

kbolino 254 days ago [-]

I had to think about this one a bit.

int_19h 256 days ago [-]

This is very unfortunate, since most structs are never going to be passed to C, yet end up paying the tax anyway. They really should have made it opt-in.

masklinn 256 days ago [-]

It can. Rust does.

That requires a way to opt out tho, because there are situations where you need a specific field ordering, so now the language needs to provide a way to tune struct compilation behaviour.

9rx 257 days ago [-]

__turbobrew__ 256 days ago [-]

Calling mmap “zero copy” is generous. I guess we glaze over the whole page fault thing, or the fact that performance is heavily dependent on how much memory pressure the process is under.

This is the same n00b trap that derailed the llama.cpp project last year because people don’t understand how memory maps and paging work, and the tradeoffs.

neillyons 257 days ago [-]

Curious to know what people are building where you need to optimise like this? eg Struct Field Alignment https://goperf.dev/01-common-patterns/fields-alignment/#avoi...

dundarious 257 days ago [-]

False sharing is an absolutely classic Concurrency 101 lesson, nothing remarkable about it.
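As a concrete picture of the lesson mentioned above, the sketch below pads two counters so that they cannot share a cache line. It assumes a 64-byte line, which is common on x86-64 but not universal; `Packed`, `Padded`, and the two `Gap` helpers are illustrative names.

```go
package main

import (
	"fmt"
	"unsafe"
)

// cacheLineSize is an assumption; check the value for your target CPU.
const cacheLineSize = 64

// Packed keeps both counters next to each other, so two goroutines that
// each update one of them will usually contend for the same cache line.
type Packed struct {
	A, B uint64
}

// Padded inserts filler so that A and B are a full cache line apart.
type Padded struct {
	A uint64
	_ [cacheLineSize - 8]byte
	B uint64
}

// packedGap and paddedGap report the distance in bytes between A and B.
func packedGap() uintptr {
	return unsafe.Offsetof(Packed{}.B) - unsafe.Offsetof(Packed{}.A)
}

func paddedGap() uintptr {
	return unsafe.Offsetof(Padded{}.B) - unsafe.Offsetof(Padded{}.A)
}

func main() {
	fmt.Println("packed gap:", packedGap()) // 8
	fmt.Println("padded gap:", paddedGap()) // 64
}
```

The padding costs memory, so it is only worth applying to fields that are written concurrently from different cores and that show up in a profile.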

kubb 257 days ago [-]

Something that shouldn’t be written in a GC language.

Cthulhu_ 257 days ago [-]

GC is not relevant in this case, it's about whether you can make structs fit in cache lines and CPU registers. Mechanical sympathy is the googleable phrase. GC is a few layers further away.

piokoch 257 days ago [-]

I don't think GC has anything to do here, doing manual memory allocation we might hit the same problem.

EdwardDiego 257 days ago [-]

Huh, this surprises me about Golang, didn't realise it was so similar to C with struct alignment. https://goperf.dev/01-common-patterns/fields-alignment/#why-...

Cthulhu_ 257 days ago [-]

jerf 257 days ago [-]

"you still need to decide on heap vs stack"

fmstephe 256 days ago [-]

But the lesson on the impact of the overhead of those paginated calls was important. Obviously everything is specific and YMMV, but this is something worth having in the back of your mind.

jensneuse 257 days ago [-]

makeworld 257 days ago [-]

Why would Pool increase memory usage?

jensneuse 257 days ago [-]

So, if you have predictable object sizes, the pool will stay flat. If the workloads are random, you have a new problem because, like in this scenario, your pool grows 5x more.

You can solve this problem. E.g. you can only give back items into the pool that are small enough. Alternatively, you could have a small pool and a big pool, but now you're playing cat and mouse.
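The first mitigation mentioned above (only return items that are small enough) might look like the following sketch. The 64 KiB threshold and the names `maxPooledCap`, `getBuffer`, and `putBuffer` are assumptions chosen for illustration; a real threshold should come from measuring the workload.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// maxPooledCap is a hypothetical cut-off; larger buffers are left to the GC.
const maxPooledCap = 64 << 10

var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// getBuffer fetches a buffer from the pool (or allocates a fresh one).
func getBuffer() *bytes.Buffer {
	return bufPool.Get().(*bytes.Buffer)
}

// putBuffer returns b to the pool unless it has grown past maxPooledCap.
// It reports whether the buffer was kept.
func putBuffer(b *bytes.Buffer) bool {
	if b.Cap() > maxPooledCap {
		return false // drop oversized buffers so one spike does not pin memory
	}
	b.Reset()
	bufPool.Put(b)
	return true
}

func main() {
	small := getBuffer()
	small.WriteString("ok")
	fmt.Println("small kept:", putBuffer(small))

	big := getBuffer()
	big.Write(make([]byte, 1<<20))
	fmt.Println("big kept:", putBuffer(big))
}
```

This keeps the pool's footprint bounded by roughly the threshold times the number of pooled buffers, at the cost of reallocating for the occasional large request.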

jerf 257 days ago [-]

theThree 256 days ago [-]

>That means you have 1 GiB in the pool.

This only happens when every request lasts 1 second.

xyproto 257 days ago [-]

I guess if you allocate more than you need upfront that it could increase memory usage.

throwaway127482 257 days ago [-]

cplli 257 days ago [-]

If that's the case, it's usually better to have non-global pools, pool ranges, drop things after a certain capacity, etc.:

https://github.com/golang/go/issues/23199

https://github.com/golang/go/blob/7e394a2/src/net/http/h2_bu...

nopurpose 257 days ago [-]

also no one GCs sync.Pool. After a spike in utilization, live with increased memory usage until program restart.

ncruces 257 days ago [-]

That's just not true. Pool contents are GCed after two cycles if unused.

nopurpose 257 days ago [-]

What do you mean? Pool content can't be GCed, because there are references to it: the pool itself.

ahmedtd 256 days ago [-]

From the sync.Pool documentation:

> If the Pool holds the only reference when this happens, the item might be deallocated.

Conceptually, the pool is holding a weak pointer to the items inside it. The GC is free to clean them up if it wants to, when it gets triggered.
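The behaviour described above can be observed by forcing collections by hand. This sketch relies on how the current implementation happens to work (pooled items survive one cycle in a victim cache and are dropped on the next), which the documentation does not promise; `newCallsAfterTwoGCs` is an illustrative helper.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// newCallsAfterTwoGCs puts one item into a pool, forces two GC cycles, asks
// the pool for an item again, and reports how often New had to run.
func newCallsAfterTwoGCs() int {
	calls := 0
	pool := sync.Pool{
		New: func() any {
			calls++
			return new([1024]byte)
		},
	}

	item := pool.Get() // pool is empty, so New runs once
	pool.Put(item)

	runtime.GC() // first cycle: pooled items are demoted
	runtime.GC() // second cycle: demoted items are released

	pool.Get() // nothing left to reuse, so New runs again
	return calls
}

func main() {
	fmt.Println("New was called", newCallsAfterTwoGCs(), "times")
}
```

If the pool kept strong references forever, the second `Get` would reuse the stored item and New would run only once.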

ashf023 257 days ago [-]

andrewf 256 days ago [-]

https://github.com/golang/go/blob/master/src/sync/pool.go#L2...

The GC calls out to sync.Pool's cleanup.

nikolayasdf123 257 days ago [-]

whalesalad 257 days ago [-]

inadequatespace 256 days ago [-]

Why doesn’t the compiler pack structs for you if it’s as easy as shuffling around based on type?

greatgib 253 days ago [-]

kunley 257 days ago [-]

"Although the struct Data contains a [1024]int array, which is 4 KB (assuming int is 4 bytes on the architecture used)"

Huh, what?

I mean, who uses 32b architecture by default?

bombela 257 days ago [-]

Most C/C++ compilers have 32b int on 64b arch. Maybe the confusion comes from that.

Also it would be 4KiB not 4KB.

kunley 251 days ago [-]

What?

Article is about the Go compiler. On 64 bit arch Go int is 64 bits.
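The size in question is easy to check on any given platform. In this sketch the `Data` struct mirrors the one quoted from the article, but the field name `values` and the helpers `dataSize` and `intBytes` are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

// Data mirrors the struct quoted from the article; the field name is a guess.
type Data struct {
	values [1024]int
}

// intBytes is the size of an int in bytes on the platform being compiled for.
func intBytes() uintptr { return strconv.IntSize / 8 }

// dataSize is the in-memory size of one Data value.
func dataSize() uintptr { return unsafe.Sizeof(Data{}) }

func main() {
	fmt.Println("int size in bits:", strconv.IntSize)
	fmt.Println("Data size in bytes:", dataSize())
}
```

On amd64 and arm64 this reports a 64-bit int and 8192 bytes (8 KiB); the 4 KiB figure only holds on 32-bit targets.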

_345 257 days ago [-]

Anyone know of a resource like this but for Python 3?

asicsp 257 days ago [-]

This might help: https://pythonspeed.com/datascience/

nikolayasdf123 257 days ago [-]

nice article. good to see statements backed up by Benchmarks right there

ljm 257 days ago [-]

You're not really writing 'Go' anymore when you're optimising it, it's defeating the point of the language as a simple but powerful interface over networked services.

jrockway 257 days ago [-]

Why? You have control over the parts where control yields noticeable savings, and the rest just kind of works with reasonable defaults.

ashf023 257 days ago [-]

emmelaich 257 days ago [-]

I think you have a point that there's generic advice for optimising: don't.

i.e. Make it simple, then measure, then make it fast if necessary.
