Squid, the venerable proxy/caching server, has recently undergone a few major changes.
In 3.2, one of the biggest changes is SMP (multi-CPU or multi-core) support, which could have a large impact on how well Squid scales on machines with multiple CPUs or cores. Previously (in 3.1 and earlier), Squid was a single-process application that used a single CPU, no matter how many the server had. Some components, such as helpers or back-end drivers, could make use of other cores, but the main Squid process was essentially serial. 3.2 changes that with a new ‘worker’ concept: Squid spawns multiple worker processes to utilise some or all of the available CPU cores. Workers are almost independent Squid processes; however, they do share a number of components, including:
- the Squid executable
- general configuration
- listening ports
- memory object cache ( depending on environment )
- disk object cache when using the Rock store
- some cache manager stats
The following components are not shared:
- disk object cache ( ufs, aufs, etc. )
- DNS caches
- SNMP stats
- helper processes and daemons
- stateful HTTP authentication
- delay pools
- some cache manager stats
An interesting aspect of the new SMP features is that workers may have different configurations (e.g. listening ports). All workers that share an http_port listen on the same IP address and TCP port. The operating system protects the shared listening socket with a lock and decides which worker gets each new HTTP connection waiting to be accepted. Once a worker accepts an incoming connection, the connection stays with that worker. You would expect this to balance load evenly between workers. In practice, however, scheduling behaviour in the kernel's TCP stack produces a fairly large skew in how much work each CPU or core does. The Squid developers have added a small patch (a stopgap until they can track down the odd scheduling behaviour) that solves this issue to an extent. Balancing is still not perfect, but CPU usage is distributed much more evenly, which should allow the application to scale better on larger SMP systems.
Using a different listening port for each worker should let you control the spread of requests between workers, if that is something you’d like to do.
There is, however, a big dependency when using workers: each worker needs its own back-end store (unless you use the Rock store). This is actually not a bad thing in itself, because you can now allocate separate disks (or RAID sets) to each worker, thereby growing the I/O capability of your system compared with running on a single disk or RAID set. Incidentally, Squid performs best on single disks or RAID 1, as parity RAID can hurt performance massively. (Do not use multiple cache_dirs on a single disk or RAID set!)
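To make the one-store-per-worker idea concrete, here is a minimal Python sketch that prints a squid.conf fragment giving each worker its own cache_dir. It relies on the workers directive, the ${process_number} macro and the if/endif conditionals as I understand them from the Squid 3.2 SMP documentation, so check them against the documentation for your version. The mount points (/cache1, /cache2), the aufs store type, the 7000 MB size and the 16/256 directory counts are illustrative assumptions, not recommendations.

```python
def per_worker_cache_dirs(mount_points, size_mb=7000, store="aufs"):
    """Build a squid.conf fragment with one cache_dir per worker.

    mount_points: one path per worker, ideally each on its own disk or
    RAID 1 set (hypothetical paths; substitute your own).
    """
    lines = [f"workers {len(mount_points)}"]
    for number, path in enumerate(mount_points, start=1):
        # Each conditional block is applied only by the worker whose
        # process number matches, so no two workers share a store.
        lines.append("if ${process_number} = " + str(number))
        lines.append(f"cache_dir {store} {path} {size_mb} 16 256")
        lines.append("endif")
    return "\n".join(lines)


if __name__ == "__main__":
    print(per_worker_cache_dirs(["/cache1", "/cache2"]))
```

Run directly, the script prints the fragment for two workers; review the result before pasting it into a real configuration.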
The other big change in 3.2 is a demand-based system for helpers and helper multiplexers. You can set a maximum number of helpers that Squid is allowed to use, but only a base number are started initially; more are started from there as demand requires.
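The ramp-up behaviour can be pictured with a toy model. The Python sketch below is not Squid's actual algorithm, only an illustration under simple assumptions: a pool starts with a base number of helpers, keeps a small number of spare helpers available as load grows, and never exceeds the maximum. The class and its parameter names (maximum, startup, idle) are hypothetical. They loosely mirror the startup= and idle= options that, as far as I know, the helper children settings accept in 3.2, so check the documentation for your version.

```python
class HelperPool:
    """Toy model of demand-based helper start-up (not Squid's real logic)."""

    def __init__(self, maximum, startup, idle):
        self.maximum = maximum      # hard cap on helper processes
        self.idle = idle            # spare helpers to keep available
        self.busy = 0               # helpers currently serving a request
        self.running = min(startup, maximum)  # helpers started at launch

    def request(self):
        """Register one more in-flight request and return helpers running."""
        if self.busy < self.maximum:
            self.busy += 1          # beyond the cap, a request would queue
        # Grow the pool so that `idle` spares remain, never past the cap.
        wanted = min(self.maximum, self.busy + self.idle)
        self.running = max(self.running, wanted)
        return self.running


if __name__ == "__main__":
    pool = HelperPool(maximum=5, startup=1, idle=1)
    for _ in range(7):
        print(pool.request())
```

In this model the pool only grows; shrinking it again when load drops is left out to keep the sketch short.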
Helpers have also been renamed, to standardise naming and make it easier to understand what each helper does.
Logging is now modularised and uses a separate daemon to provide improved performance in SMP environments. Asynchronous buffering of log writes is also supported, so that the logging system does not hold up proxy write operations.
There are a number of other changes, including configuration tags (added, changed and removed), Surrogate/1.0 protocol extensions to HTTP, and Solaris pthreads support.
Two new helpers make their appearance in Squid 3.3. The first is the SQL database logging daemon, which can log entries (in native Squid format) directly to a SQL database, using the Perl database abstraction layer (DBI). This provides some interesting options for people who currently use custom scripts to get Squid data into databases.
The second is a time quota helper (implemented through ACLs), which can be used to allocate time budgets for using Squid.
Other improvements in 3.3 include SSL bump server-first (rather than client-first), SSL server certificate mimicking and custom HTTP request headers. Some config tags have changed to support these features, but mostly everything stays the same as in 3.2.