Wednesday 27 November 2019

Twice and thrice over, as they say, good is it to repeat and review what is good.

Three years ago I wrote about using the AFL fuzzer to find bugs in several NetSurf libraries. I have repeated this exercise a couple of times since then and thought I would summarise what I found with my latest run.

I started by downloading the latest version of AFL (2.52b) and compiling it. This went as smoothly as one could hope for and I experienced no issues, although having done this several times before probably helps.

libnsbmp

I started with libnsbmp, which is used to render Windows BMP and ICO files; the ICO format remains very popular for website favicons. The library was built with AFL instrumentation enabled, output directories were created for the results, and a main instance and four subordinate fuzzer instances were started.

vince@workshop:libnsbmp$ LD=afl-gcc CC=afl-gcc AFL_HARDEN=1 make VARIANT=debug test
afl-cc 2.52b by <lcamtuf@google.com>
afl-cc 2.52b by <lcamtuf@google.com>
afl-cc 2.52b by <lcamtuf@google.com>
 COMPILE: src/libnsbmp.c
afl-cc 2.52b by <lcamtuf@google.com>
afl-as 2.52b by <lcamtuf@google.com>
[+] Instrumented 633 locations (64-bit, hardened mode, ratio 100%).
      AR: build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/libnsbmp.a
 COMPILE: test/decode_bmp.c
afl-cc 2.52b by <lcamtuf@google.com>
afl-as 2.52b by <lcamtuf@google.com>
[+] Instrumented 57 locations (64-bit, hardened mode, ratio 100%).
    LINK: build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp
afl-cc 2.52b by <lcamtuf@google.com>
 COMPILE: test/decode_ico.c
afl-cc 2.52b by <lcamtuf@google.com>
afl-as 2.52b by <lcamtuf@google.com>
[+] Instrumented 71 locations (64-bit, hardened mode, ratio 100%).
    LINK: build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_ico
afl-cc 2.52b by <lcamtuf@google.com>
Test bitmap decode
Tests:1053 Pass:1053 Error:0
Test icon decode
Tests:609 Pass:609 Error:0
    TEST: Testing complete
vince@workshop:libnsbmp$ mkdir findings_dir graph_output_dir
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f02 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f02.log >&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f03 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f03.log >&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f04 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f04.log >&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -S f05 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null > findings_dir/f05.log >&1 &
vince@workshop:libnsbmp$ afl-fuzz -i test/ns-afl-bmp/ -o findings_dir/ -M f01 ./build-x86_64-linux-gnu-x86_64-linux-gnu-debug-lib-static/test_decode_bmp @@ /dev/null
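
The test_decode_bmp binary used above is the library's own test harness, which simply reads the file AFL substitutes for the @@ argument and runs it through the decoder. For anyone wanting to repeat this with another library, the sketch below shows the general shape of such a harness; decode_image() is a hypothetical stand-in for the library's real entry points (in libnsbmp's case the actual calls live in test/decode_bmp.c).

/* Minimal shape of an AFL file-input harness (sketch only, not the real
 * test/decode_bmp.c). decode_image() is a hypothetical stand-in for the
 * library's analyse/decode entry points. */
#include <stdio.h>
#include <stdlib.h>

static int decode_image(const unsigned char *data, size_t size)
{
        /* the real harness would call the decoder here */
        (void)data; (void)size;
        return 0;
}

int main(int argc, char **argv)
{
        FILE *f;
        long len;
        unsigned char *data;

        if (argc != 2) {
                fprintf(stderr, "Usage: %s image-file\n", argv[0]);
                return EXIT_FAILURE;
        }

        /* afl-fuzz substitutes its mutated input file for the @@ argument */
        f = fopen(argv[1], "rb");
        if (f == NULL)
                return EXIT_FAILURE;

        fseek(f, 0, SEEK_END);
        len = ftell(f);
        fseek(f, 0, SEEK_SET);

        data = malloc(len);
        if (data == NULL || fread(data, 1, len, f) != (size_t)len) {
                free(data);
                fclose(f);
                return EXIT_FAILURE;
        }
        fclose(f);

        /* any crash or hang in here is what the fuzzer is looking for */
        decode_image(data, (size_t)len);

        free(data);
        return EXIT_SUCCESS;
}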

The number of subordinate fuzzer instances was selected to allow the system in question (an AMD 2600X) to keep all the cores in use at a clock of 4GHz, which gave the highest number of executions per second. This might be improved with better cooling but I have not investigated this.

AFL master instance after six days

After five days and six hours the "cycle count" field on the master instance had changed to green, which the AFL documentation suggests means the fuzzer is unlikely to discover anything new, so the run was stopped.

Just before stopping, the afl-whatsup tool was used to examine the state of all the running instances.

vince@workshop:libnsbmp$ afl-whatsup -s ./findings_dir/
status check tool for afl-fuzz by <lcamtuf@google.com>

Summary stats
=============

       Fuzzers alive : 5
      Total run time : 26 days, 5 hours
         Total execs : 2873 million
    Cumulative speed : 6335 execs/sec
       Pending paths : 0 faves, 0 total
  Pending per fuzzer : 0 faves, 0 total (on average)
       Crashes found : 0 locally unique

Just for completeness there is also the graph of how the fuzzer performed over the run.

AFL fuzzer performance over libnsbmp run

There were no crashes at all (and none have been detected through fuzzing since the original run) and the 78 reported hangs were checked and all actually decode in a reasonable time. It seems the fuzzer's default "hang" detection is simply a little aggressive for larger images.

libnsgif

I went through a similar setup with libnsgif, which is used to render the GIF image format. The run was performed on a similar system, running for five days and eighteen hours. The outcome was similar to libnsbmp with no hangs or crashes.


vince@workshop:libnsgif$ afl-whatsup -s ./findings_dir/
status check tool for afl-fuzz by <lcamtuf@google.com>

Summary stats
=============

       Fuzzers alive : 5
      Total run time : 28 days, 20 hours
         Total execs : 7710 million
    Cumulative speed : 15474 execs/sec
       Pending paths : 0 faves, 0 total
  Pending per fuzzer : 0 faves, 0 total (on average)
       Crashes found : 0 locally unique

libsvgtiny

AFL fuzzer results for libsvgtiny
I then ran the fuzzer on the SVG rendering library, using a dictionary to help it cope with a sparse textual input format. The run was allowed to continue for almost fourteen days with no crashes or hangs detected.
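
For reference, an AFL dictionary is simply a file of named tokens which the fuzzer splices into its mutations, passed in with the -x option. The entries below are only illustrative of the kind of SVG keywords such a dictionary might contain; they are not the exact dictionary used for this run.

# illustrative SVG tokens for an afl-fuzz dictionary (passed with -x)
tag_svg_open="<svg"
tag_svg_close="</svg>"
tag_path="<path"
tag_rect="<rect"
attr_width="width="
attr_viewbox="viewBox="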

In an ideal situation this run would have been allowed to continue, but the system running it required a restart for maintenance.

Conclusion

The aphorism "absence of proof is not proof of absence" seems to apply to these results. While the new fuzzing runs revealed no additional failures, that does not mean there are no defects left in the code to find. All I can really say is that the AFL tool was unable to find any failures within the time available.

Additionally, the test corpus AFL produced did not significantly change the code coverage metrics, so the existing set was retained.

Will I spend the time to re-run these tests in future? Perhaps, but I think more would be gained from enabling fuzzing of the other NetSurf libraries and picking the low-hanging fruit there than from expending thousands of hours performing these runs again.

Thursday 11 July 2019

We can make it better than it was. Better...stronger...faster.

It is not a novel observation that computers have become so powerful that a reasonably recent system has a relatively long life before obsolescence. This is in stark contrast to the period between the nineties and the teens where it was not uncommon for users with even moderate needs from their computers to upgrade every few years.

This upgrade cycle was mainly driven by huge advances in processing power, memory capacity and ballooning data storage capability. Of course software engineers used up more and more of the available resources, and with each new release ensured users needed to update to have a reasonable experience.

And then, sometime in the early teens, this cycle slowed almost as quickly as it had begun as systems became "good enough". I experienced this at a time when I was relocating for a new job and had moved most of my computer use to my laptop, which was just as powerful as my desktop but far more flexible.

As a software engineer I used to have a pretty good computer for myself, but I was never prepared to spend the money on "top of the range" equipment because it would always be obsolete, and generally I had access to much more powerful servers if I needed more resources for a specific task.

To illustrate, the system specification of my desktop PC at the opening of the millennium was:
  • Single core Pentium 3 running at 500MHz
  • Socket 370 motherboard with 100 MHz Front Side Bus
  • 128 Megabytes of memory
  • A 25 Gigabyte Deskstar hard drive
  • 150 MHz TNT 2 graphics card
  • 10 Megabit network card
  • Unbranded 150W PSU
But by 2013 the specification had become:
    2013 PC build still using an awesome beige case from 1999
  • Quad core i5-3330S Processor running at 2700MHz
  • FCLGA1155 motherboard running memory at 1333 MHz
  • 8 Gigabytes of memory
  • Terabyte HGST hard drive
  • 1,050 MHz Integrated graphics
  • Integrated Intel Gigabit network
  • OCZ 500W 80+ PSU
The performance change between these systems was more than tenfold in fourteen years with an upgrade roughly once every couple of years.

I recently started using that system again in my home office mainly for Computer Aided Design (CAD), Computer Aided Manufacture (CAM) and Electronic Design Automation (EDA). The one addition was to add a widescreen monitor as there was not enough physical space for my usual dual display setup.

To my surprise I increasingly returned to this setup for programming tasks. Firstly, being at my desk acts as an indicator to family members that I am concentrating, whereas the laptop no longer had that effect. Secondly, I really like the ultra-wide display for coding; it has become my preferred display and I had been saving for a UWQHD monitor.

Alas, last month the system started freezing. Sometimes it would be stable for several days and then, without warning, the mouse pointer would stop, my music would cease and a power cycle was required. I tried several things to rectify the situation: replacing the thermal compound and the CPU cooler, and trying different memory, all to no avail.

As fixing the system cheaply appeared unlikely I began looking for a replacement and was immediately troubled by the size of the task. Somewhere in the last six years, while I was not paying attention, the world had moved on; after a great deal of research I managed to come to an answer.

AMD have recently staged something of a comeback with their Ryzen processors after almost a decade of very poor offerings when compared to Intel. The value for money when considering the processor and motherboard combination is currently very much weighted towards AMD.

My timing also seems fortuitous as the new Ryzen 2 processors have just been announced which has resulted in the current generation being available at a substantial discount. I was also encouraged to see that the new processors use the same AM4 socket and are supported by the current motherboards allowing for future upgrades if required.

I purchased a complete new system for under five hundred pounds, comprising:
    New PC assembled and wired up
  • Hex core Ryzen 5 2600X Processor 3600MHz
  • MSI B450 TOMAHAWK AMD Socket AM4 Motherboard
  • 32 Gigabytes of PC3200 DDR4 memory
  • Aero Cool Project 7 P650 80+ platinum 650W Modular PSU
  • Integrated RTL Gigabit networking
  • Lite-On iHAS124 DVD Writer Optical Drive
  • Corsair CC-9011077-WW Carbide Series 100R Silent Mid-Tower ATX Computer Case
to which I added some recycled parts:
  • 250 Gigabyte SSD from laptop upgrade
  • GeForce GT 640 from a friend
I installed a fresh copy of Debian and all my CAD/CAM applications and have been using the system for a couple of weeks with no obvious issues.

An example of the performance difference: compiling NetSurf from clean with an empty ccache used to take 36 seconds and now takes 16, which is a nice improvement. However, a clean build with the results cached has gone from 6 seconds to 3, which is far less noticeable, and during development a normal edit, build, debug cycle affecting only a small number of files has gone from 400 milliseconds to 200, which simply feels instant in both cases.

My conclusion is that the new system is completely stable but that I have gained very little in common usage. Objectively the system is over twice as fast as its predecessor but aside from compiling large programs or rendering huge CAD drawings this performance is not utilised. Given this I anticipate this system will remain unchanged until it starts failing and I can only hope that will be at least another six years away.

Tuesday 19 February 2019

A very productive weekend

I just hosted a NetSurf Developer weekend, which is an opportunity for us to meet up and make use of all the benefits of working together. We find that the ability to plan work and discuss solutions without losing the nuances of body language generally results in better outcomes for the project.

NetSurf Development build
Due to other commitments on our time, the group has not been able to do more than basic maintenance activities in the last year, which has resulted in the developer events becoming a time to catch up on maintenance rather than making progress on features.

Because of this, the July and November events last year did not feel terribly productive; there were discussions about what we should be doing and bugs were considered, but there was a distinct lack of committed code.

As can be seen from our notes this time was a refreshing change. We managed to complete a good number of tasks and actually add some features while still having discussions, addressing bugs and socialising.

We opened on the Friday evening by creating a list of topics to look at over the following days and updating the wiki notes. We also reviewed the cross compiler toolchains which had been updated to include the most recent releases for things like openssl, curl etc.

As part of this review we confirmed the decision to remove the Atari platform from active support as its toolchain builds have remained broken for over two years with no sign of any maintainer coming forward.

While it is a little sad to see a platform removed, it had become a burden on our strained resources, requiring us to maintain a CI worker with a very old OS using tooling that can no longer be replicated. The tooling issue means a developer cannot test changes locally before committing, so testing changes that affected all frontends was difficult.

Saturday saw us clear all the topics from our list which included:
  • Fixing a bug that prevented our reference counted string handling library from compiling.
  • Finishing the sanitizer work started the previous July.
  • Fixing several bugs in the Framebuffer frontend installation.
  • Making the Framebuffer UI use the configured language for resources.
The main achievement of the day however was implementing automated system testing of the browser. This was a project started by Daniel some eight years ago but worked on by all of us so seeing it completed was a positive boost for the whole group.

The implementation consisted of a frontend named monkey. This frontend to the browser takes textual commands to perform operations (e.g. open a window or navigate to a URL) and generates results in a structured text format. Monkey is driven by a Python program named monkeyfarmer which runs a test plan, ensuring the results are as expected.

This allows us to run a complete browsing session in an automated way; previously someone would have to manually build the browser and check the tests by hand. This manual process was tedious and rarely completed across our entire test corpus, generally concentrating on just those areas that had been changed, such as JavaScript output.

We have combined the monkey tools and our test corpus into a CI job which runs the tests on every commit giving us assurance that the browser as a whole continues to operate correctly without regression. Now we just have the task of creating suitable plans for the remaining tests. Though I remain hazy as to why, we became inordinately amused by the naming scheme for the tools.

Google webp library gallery rendered in NetSurf
We rounded the Saturday off by going out for a very pleasant meal with some mutual friends. Sunday started by adding a bunch of additional topics to consider and we made good progress addressing these.

We performed a bug triage and managed to close several issues and commit to fixing a few more. We even managed to create a statement of work of things we would like to get done before the next meetup.

My main achievement on the Sunday was to add WEBP image support. This uses the Google libwebp library to do all the heavy lifting, and adding a new image content handler to NetSurf is pretty straightforward.
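
Since libwebp does the heavy lifting, the content handler is essentially built around a couple of library calls. The fragment below is only a sketch of that decode step, not the actual NetSurf handler, whose bitmap plumbing is omitted here:

/* Sketch of the decode step a WEBP content handler is built around;
 * not the actual NetSurf handler, just the libwebp calls it relies on. */
#include <stdint.h>
#include <stdio.h>
#include <webp/decode.h>

int decode_webp(const uint8_t *data, size_t size)
{
        int width, height;
        uint8_t *rgba;

        /* cheap header check before committing to a full decode */
        if (!WebPGetInfo(data, size, &width, &height))
                return -1;

        /* decode the whole image into a freshly allocated RGBA buffer */
        rgba = WebPDecodeRGBA(data, size, &width, &height);
        if (rgba == NULL)
                return -1;

        printf("decoded %dx%d webp image\n", width, height);

        /* the real handler would copy the pixels into a browser bitmap
         * here before releasing the buffer */
        WebPFree(rgba);
        return 0;
}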

Sunday 30 September 2018

All I wanted to do was check an error code

I was feeling a little under the weather last week and did not have enough concentration to work on developing a new NetSurf feature as I had planned. Instead I decided to look at a random bug from our worryingly large collection.

This led me to consider the HTML form submission function, at which point it was "can open, worms everywhere". The code in question has a fairly simple job to explain:
  1. A user submits a form (by clicking a button or such) and the Document Object Model (DOM) is used to create a list of information in the web form.
  2. The list is then converted to the appropriate format for sending to the web site server (a sketch of this encoding step follows the list).
  3. An HTTP request is made using the correctly formatted information to the web server.
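
For the default form encoding, that second step boils down to standard application/x-www-form-urlencoded encoding of each name/value pair. A minimal sketch of that encoding, purely for illustration and not NetSurf's implementation, is:

/* Minimal sketch of application/x-www-form-urlencoded encoding of one
 * name/value pair; illustrative only, not NetSurf's implementation. */
#include <ctype.h>
#include <stdio.h>

static void urlencode(const char *s)
{
        for (; *s != '\0'; s++) {
                unsigned char c = (unsigned char)*s;

                if (isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~')
                        putchar(c);
                else if (c == ' ')
                        putchar('+');        /* form encoding uses '+' for spaces */
                else
                        printf("%%%02X", c); /* everything else is percent-escaped */
        }
}

int main(void)
{
        /* hypothetical form field: name "q", value "netsurf browser" */
        urlencode("q");
        putchar('=');
        urlencode("netsurf browser");
        putchar('\n');               /* prints: q=netsurf+browser */
        return 0;
}
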
However, the code I was faced with, while generally functional, was impenetrable, having accreted over a long time.

screenshot of NetSurf test form
At this point I was forced into a diversion to fix up the core URL library handling of query strings (this is used when the form data is submitted as part of the requested URL) which was necessary to simplify some complicated string handling and make the implementation more compliant with the specification.

My next step was to add some basic error reporting instead of warning the user the system was out of memory for every failure case, which was making debugging somewhat challenging. I was beginning to think I had discovered a series of very hairy yaks, although at least I was not trying to change a light bulb, which can get very complicated.

At this point I ran into the form_successful_controls_dom() function which performs step one of the process. This function had six hundred lines of code, hundreds of conditional branches, 26 local variables and five levels of indentation in places. These properties combined resulted in a cyclomatic complexity metric of 252. For reference, programmers generally try to keep a single function to no more than a hundred lines of code with as few local variables as possible, resulting in a CCM of 20.
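
For anyone unfamiliar with the metric, cyclomatic complexity is computed from a function's control flow graph as

M = E - N + 2P

where E is the number of edges, N the number of nodes and P the number of connected components. For a single function this works out to roughly one more than the number of decision points, which is why hundreds of conditional branches translate directly into a figure in the hundreds.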

I now had a choice:

  • I could abandon investigating the bug, because even if I could find the issue changing such a function without adequate testing is likely to introduce several more.
  • I could refactor the function into multiple simpler pieces.
I slept on this decision and decided to at least try to refactor the code in an attempt to pay back a little of the technical debt in the browser (and maybe let me fix the bug). After several hours of work the refactored source has the desirable properties of:

  • multiple straightforward functions
  • no function much more than a hundred lines long
  • resource lifetime is now obvious and explicit
  • errors are correctly handled and reported

I carefully examined the change in generated code and was pleased to see the compiler output had become more compact. This is an important point that less experienced programmers sometimes miss: if your source code is written such that a compiler can reason about it easily, you often get much better results than from the hand-compacted alternative. However, even if the resulting code had been larger the improved source would have been worth it.

After spending over ten hours working on this bug I have not resolved it yet; indeed, one might suggest I have not even directly considered it yet! I wanted to use this to explain a little to users who have to wait a long time for their issues to be resolved (in any project, not just NetSurf) just how much effort is sometimes involved in a simple bug.

Tuesday 7 August 2018

The brain is a wonderful organ; it starts working the moment you get up in the morning and does not stop until you get into the office.

I fear that I may have worked in a similar office environment to Robert Frost. Certainly his description is familiar to those of us who have been subjected to modern "open plan" offices. Such settings may work for some types of job but for myself, as a programmer, it has a huge negative effect.

My old basement office
When I decided to move on from my previous job, my new position allowed me to work remotely. I have worked from home before so knew what to expect. My experience led me to believe the main aspects to address when home working were:
Isolation
This is difficult to mitigate, but frequent face to face meetings and video calls with colleagues can address it, providing you are aware that some managers have a terrible habit of "out of sight, out of mind" management.
Motivation
You are on your own a lot of the time which means you must motivate yourself to work. Mainly this is achieved through a routine. I get dressed properly, start work the same time every day and ensure I take breaks at regular times.
Work life balance
This is often more of a problem than you might expect and not in the way most managers assume. A good motivated software engineer can have a terrible habit of suddenly discovering it is long past when they should have finished work. It is important to be strict with yourself and finish at a set time.
Distractions
In my previous office, testers, managers, production and support staff were all mixed in with the developers, resulting in a lot of distractions; however, when you are at home there are also a great number of possible distractions. It can be difficult to stop friends and family assuming you are available during working hours to run errands. I find I need to carefully budget time for such tasks and take it out of my working time as if I were actually in an office.
Environment
My previous office had "tired" furniture and decoration in an open plan which often had a negative impact on my productivity. When working from home I find it beneficial to partition my working space from the rest of my life and ensure family know that when I am in that space I am unavailable. You inevitably end up spending a great deal of time in this workspace and it can have a surprisingly large effect on your productivity.
Being confident I was aware of what I was letting myself in for, I knew I required a suitable place to work. In our previous home the only space available for my office was a four by ten foot cellar room with artificial lighting. Despite its size I was generally productive there, as there were few distractions and the door let me "leave work" at the end of the day.

Garden office was assembled June 2017
This time my resources for creating the space were larger and I wanted a place I would be comfortable spending a lot of time in. Initially I considered using the spare bedroom which my wife was already using as a study. This was quickly discounted as it would be difficult to maintain the necessary separation of work and home.

Instead we decided to replace the garden shed with a garden office. The contractor ensured the structure selected met all the local planning requirements while remaining within our budget. The actual construction was surprisingly rapid. The previous structure was removed and a concrete slab base was placed in a few hours on one day and the timber building erected in an afternoon the next.

Completed office in August 2018
The building arrived separated into large sections on a truck which the workmen assembled rapidly. They then installed wall insulation, glazing and roof coverings. I had chosen to have the interior finished in hardwood plywood, which is hard wearing and makes it easy to apply finishes as required.

Work desk in July 2017
Although the structure could have been painted at the factory, Melodie and I did this ourselves to keep the project within budget. I laid a laminate floor suitable for high moisture areas (the UK is not generally known as a dry country) and Steve McIntyre and Andy Simpkins assisted me with various additional tasks to turn it into a usable space.

To begin with I filled the space with furniture I already had, for example the desk was my old IKEA Jerker which I have had for over twenty years.

Work desk in August 2018
Since then I have changed the layout a couple of times but have finally returned to having my work desk in the corner looking out over the garden. I replaced the Jerker with a new IKEA Skarsta standing desk, PEXIP bought me a nice work laptop and I acquired a nice print from Lesley Mitchell, but overall little has changed in my professional work area in the last year and I have a comfortable environment.

Cluttered personal work area
In addition the building is large enough that there is space for my electronics bench. The bench itself was given to me by Andy. I purchased some inexpensive kitchen cabinets and worktop (white is cheapest) to obtain a little more bench space and storage. Unfortunately all those flat surfaces seem to accumulate stuff at an alarming rate and it looks like I need a clear out again.

In conclusion I have a great work area which was created at a reasonable cost.

There are a couple of minor things I would do differently next time:
  • Position the building better with respect to the boundary fence. I allowed too much distance on one side of the structure which has resulted in an unnecessary two foot wide strip of unusable space.
  • Ensure the door was made from better materials. The first winter in the space showed that the door was a poor fit as it was not constructed to the same standard as the rest of the building.
  • The door should have been positioned on the end wall instead of the front. Use of the building showed moving the door would make the internal space more flexible.
  • Plan the layout more effectively ahead of time, ensuring I knew where services (electricity) would enter and where outlets would be placed.
  • Ensure I have an electrician on site for the first fix so electrical cables could be run inside the walls instead of surface trunking.
  • Budget for air conditioning as so far the building has needed heating in winter and cooling in summer.
In essence my main observation is that better planning of the details matters. If I had been more aware of this a year ago perhaps I would not be budgeting to replace the door and fit air conditioning now.

Wednesday 1 August 2018

Irony is the hygiene of the mind

While Elizabeth Bibesco might well be right about the mind, software cleanliness requires a different approach.

Previously I have written about code smells which give a programmer hints where to clean up source code. A different technique, which has recently become readily available, is using tool-chain based instrumentation to perform run time analysis.

At a recent NetSurf developer weekend Michael Drake mentioned a talk he had seen at the Guadec conference which referenced the use of sanitizers for improving the security and correctness of programs.

Sanitizers differ from other code quality techniques such as compiler warnings and static analysis in that they detect issues when the program is executed rather than by examining the source code. There are currently two commonly used instrumentation types:
address sanitizer
This instrumentation detects several common errors when using memory, such as "use after free".
undefined behaviour sanitizer
This instruments computations where the language standard has behaviour which is not clearly specified, for example left shifts of negative values (ISO 9899:2011 6.5.7 Bitwise shift operators).
As these are runtime checks it is necessary to actually execute the instrumented code. Fortunately most of the NetSurf components have good unit test coverage so Daniel Silverstone used this to add a build target which runs the tests with the sanitizer options.
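
As an illustration of the kind of thing the undefined behaviour sanitizer reports at run time, here is a small hedged example; the -fsanitize flags shown are the standard GCC/Clang ones, not necessarily the exact options used by the NetSurf build target.

/* Build with the sanitizers enabled, e.g.
 *   cc -g -fsanitize=address,undefined example.c
 * (standard GCC/Clang flags; the NetSurf build target options may differ) */
#include <stdio.h>

int main(void)
{
        int value = -1;

        /* left shift of a negative value is undefined behaviour per
         * ISO 9899:2011 6.5.7 and is reported by UBSan when this line runs */
        int shifted = value << 4;

        printf("%d\n", shifted);
        return 0;
}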

The previous investigation of this technology had been unproductive because of the immaturity of support in our CI infrastructure. This time the tool chain could be updated to be sufficiently robust to implement the technique.

Jobs were then added to the CI system to build this new target for each component in a similar way to how the existing coverage reports are generated. This resulted in failed jobs for almost every component which we proceeded to correct.

An example of how most issues were addressed is provided by Daniel fixing the bitmap library. Most of the fixes ensured correct type promotion in bit manipulation, however the address sanitizer did find a real out of bounds access when a malformed BMP header is processed. This is despite this library being run with a fuzzer and electric fence for many thousands of CPU hours previously.
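
The type promotion issues are of the classic kind where a narrow unsigned value is promoted to a signed int before being shifted. The example below illustrates the pattern rather than the exact libnsbmp code:

#include <stdint.h>

/* Assemble a 32-bit little-endian value from four header bytes.
 * Illustrative of the pattern fixed, not the exact libnsbmp code. */
uint32_t read_u32(const uint8_t *p)
{
        /* Wrong: p[3] is promoted to (signed) int, so shifting it by 24 can
         * move a bit into the sign position, which is undefined behaviour:
         *     return p[0] | (p[1] << 8) | (p[2] << 16) | (p[3] << 24);
         */

        /* Right: widen to an unsigned type before shifting */
        return (uint32_t)p[0] |
               ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) |
               ((uint32_t)p[3] << 24);
}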

Although we did find a small number of real issues, the majority of the fixes were to tests which failed to correctly clean up the resources they used. This seems to parallel what I observed with other run time testing, like AFL and Valgrind, in that the test environment often has the largest impact on detected issues to begin with.

In conclusion it appears that an instrumented build combined with our existing unit tests gives us another tool to help improve our code quality. Given the very low amount of engineering time the NetSurf project has available, automated checks like these are a good way to help us avoid introducing issues.

Friday 1 June 2018

You can't make a silk purse from a sow's ear

Pile of network switches
I needed a small Ethernet network switch in my office so went to my pile of devices and selected an old Dell PowerConnect 2724 from the stack. This seemed the best candidate as the others were intended for data centre use and known to be very noisy.

I installed it into place and immediately ran into a problem: the switch was not quiet enough; in fact I could not concentrate at all with it turned on.

Graph of quiet office sound pressure
Believing I could not fix what I could not measure, I decided to download an app for my phone that measures raw sound pressure. This would allow me to empirically examine what effect any changes to the switch made.

The app is not calibrated, so it can only be used to examine relative changes, which means a reference level is required. I took a reading in the office with the switch turned off but all other equipment operating to obtain a baseline measurement.

All measurements were made with the switch and phone in the same positions about a meter apart. The resulting yellow curves are the average for a thirty second sample period with the peak values in red.

The peak between 50Hz and 500Hz initially surprised me but after researching how a human perceives sound it appears we must apply the equal loudness curve to correct the measurement.

Graph of office sound pressure with switch turned on
With this in mind we can concentrate on the data between 200Hz and 6000Hz as the part of the frequency spectrum with the most impact. So in the reference sample we can see that the audio pressure is around the -105dB level.

I turned the switch on and performed a second measurement, which showed a level around -75dB with peaks at -50dB. This is a difference of some 30dB; if we assume our reference is a "calm room" at 25dB(SPL) then the switch is raising the ambient noise level to something similar to a "normal conversation" at 55dB(SPL).
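
Because decibels are a logarithmic ratio, the uncalibrated relative readings can be placed on an absolute scale simply by adding the measured difference to the assumed reference level; the estimate above works out as

\Delta L = (-75\,\mathrm{dB}) - (-105\,\mathrm{dB}) = 30\,\mathrm{dB}, \qquad L_{\mathrm{switch}} \approx 25\,\mathrm{dB(SPL)} + 30\,\mathrm{dB} = 55\,\mathrm{dB(SPL)}

where the 25dB(SPL) "calm room" baseline is an assumption rather than a calibrated figure.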

Something had to be done if I were to keep using this device so I opened the switch to examine the possible sources of noise.

Dell PowerConnect 2724 with replacement Noctua fan
There was a single 40x40x20mm 5V high capacity Sunon brand fan in the rear of the unit. I unplugged the fan and the noise level immediately returned to ambient, indicating that all the noise was being produced by this single device; unfortunately the switch soon overheated without the cooling fan operating.

I thought the fan might be defective so purchased a high quality "quiet" NF-A4x20 replacement from Noctua. The fan has rubber mounting fixings to further reduce noise and I was hopeful this would solve the issue.

Graph of office sound pressure with modified switch turned on
The initial results were promising, with noise above 2000Hz largely being eliminated. However the way the switch enclosure was designed caused the airflow itself to make sound, which produced a level of around 40dB(SPL) between 200Hz and 2000Hz.

I had the switch in service in this configuration for several weeks, but eventually the device proved impractical on several points:

  • The management interface was dreadful to use.
  • The network performance was not very good, especially in trunk mode.
  • The lower frequency noise became a distraction for me in an otherwise quiet office.

In the end I purchased an 8 port Zyxel switch which is passively cooled and therefore silent in operation, and has none of the other drawbacks.

From this experience I have learned some things:

  • Higher frequency noise (2000Hz and above) is much more difficult to ignore than other types of noise.
  • As I have become older my tolerance for equipment noise has decreased and it actively affects my concentration levels.
  • Some equipment has a design which means its audio performance cannot be improved sufficiently.
  • Measuring and interpreting noise sources is quite difficult.