Google, get your paws off my kids!

What I see is a naive education system being led astray by a hugely influential data harvesting company. Leaving aside my huge reservations about giving Google access to my kids, to harvest their data and keep it for their entire lives, I’ll simply talk about one tiny issue that I see as the first hurdle they fell at. The very first thing I looked into regarding Google in the Classroom set off alarm bells.

Google Classroom partners with Kahoot, a Google Education Partner. Even at a glance, Kahoot looks to me like a shady company. The Common Sense Privacy Report sort of backs up my gut feeling: a score of 66% with an amber ‘Warning’ badge is way below the standard I expect in my child’s education.

As an example of what looked shady to me, if I search Google for ‘Kahoot Data Harvesting’ I get the following URL: –

(http://) ( (/kahoot-collecting-data.html)

I have broken up the above URL because I don’t want to add to its value by linking to it. That’s marketing spam! Worse still, it has been deliberately designed to shadow legitimate questions being asked about ‘Kahoot data harvesting’. Instead of talking about privacy issues, the article purports to be advising teachers on how best to use Kahoot in classrooms. There is no doubt in my mind that this is driven by Kahoot. It’s shady and it’s deceptive.

The entire first page of search results, even on non-Google search engines, is either from the Kahoot website directly, or ‘shadowed’ content from them that’s masquerading as ‘articles’.

Straight off the bat, I do not trust this company with access to my child, so why should schools be pushing this?

The subject of data exploitation and privacy abuse is huge. So big, so insidious, and only so widely accepted because of public naivety, which is pretty much unavoidable due to the complexities involved. It’s impossible to convey in a few words, or even a few sentences or paragraphs. However, since I originally wrote this, a documentary team have had a good stab at the issue, so…

Here’s some Unsolicited Advice™

Start by watching The Social Dilemma – it’s a documentary with some dramatisation, and it provides a gentle introduction to the subject. If you don’t have Netflix, then find someone you know who does, and go round with sandwiches and a cake to enjoy while watching it together. If you don’t know anyone with Netflix, then ‘pirate’ it with BitTorrent!

I think it is crucial that we all lose our naivety, and make decisions and demands based on sound rational judgement. The revelations in The Social Dilemma are already quite well understood within many sectors of the tech industry. Lots of people have been warning about this stuff for years, but it’s all so opaque, abstract, and complex, all at once, that unless you are immersed in the tech world, it’s an uphill struggle to properly convey. And Netflix, to their credit, have managed to do a really good job – I’m genuinely surprised that they got away with it.

Enriching, or just engaging? – Why advertising destroys your web.

Advertising is destroying the web; some say it is already destroyed. Most people who work in the tech industry know this, but only because we get to see the web evolve. For everyone else, it’s hard to get any visibility into what’s going on, so I’ll try to explain it.

Every time you visit a web page, absolutely any page, there’s an opportunity to make money. That’s because when you’re looking at a screen, you could also be looking at an advert. Advertisers will pay money to have their ad on your screen at that moment.

The ad-tech industry is complex, but most of that complexity is internal to the industry. Really, there’s just the advertiser, the publisher, and the reader. The publisher creates interesting stuff, and we read it. Advertisers who want to reach people like us will pay money to get their message in front of our eyes.

On the web, there are billions of people with web browsers, and every page-view is an opportunity to make a tiny amount of money – it all adds up. An advertiser somewhere will be willing to pay some tiny sum for even the tiniest chance of a profit.

If you think email spam is bad, then realise it represents the lower bound of internet advertising. The worst web advertising is little better than email spam, it has no meaningful lower bound.

To meet this unbounded demand from advertisers, internet publishers need to produce content to satisfy that demand. In principle, good content attracts many interested readers (page-views), and therefore lots of advertising revenue.

In practice however, publishers have realised that good content is not really necessary, and that more money can be made from ‘seemingly’ good content or ‘otherwise tempting’ content. In essence, it’s cheaper to buy or copy existing content, and it’s enough to just write an effective headline. They get paid as you look at their page, there’s little value in your opinion of what they publish.

In other words, publishers profit more from syndicated content and click-bait than they do from creating original, informed, and objective writing.

noun: clickbait – content whose main purpose is to attract attention and encourage visitors to click on a link to a particular web page.

Oxford University Press

Historically, the model used in magazines had a targeted advertising component. These would be sold at a cover price, and further revenue would come from advertising. The magazine articles would have to cater to our interests, or we’d simply stop buying the magazine. There was a strong incentive to not mess with readers, though ‘advertorial content’ (adverts disguised as articles) was rife in the more general interest publications.

Today, there are comparatively vast numbers of publishers, all competing to get you to look at their pages, because as well as traditional publishers, there are now also bloggers and social media ‘influencers’. There are even sites that make no effort to conceal their motives, and produce nothing but click-bait headlines, carefully calculated to result in your click, that lead nowhere substantial. In fact many established publishers also do exactly this.

This is unavoidable where the goal is to make money in an environment with no enforced rules, where customer loyalty no longer exists, and where tiny payments accumulate simply by getting you to look at something. Inevitably, sites offer as little as they can get away with, while still getting you to click their links.

“I’ve noticed that a lot of news articles take a few examples of something outrageous, but never say anything about how widespread the practice is.”

bleah1000 observing examples of sensational click-bait (one of its many forms).

So, who controls what gets in front of your eyes? It’s whoever controls your starting point on the web. When you launch a web browser, what’s the first thing you see? When you open your phone, what notifications are waiting for you? Who is asking you to install their app ‘for a better experience’? What do you see in these apps?

The web has changed from an enriching experience to a frantic, exploitative clamour for your eyeballs on some advertiser’s copy.

As an aside, you can improve your experience to some extent by using either Firefox + uBlock Origin, or Brave (which is built on Chromium, the open-source base of Google Chrome, with ad-blocking built in). The industry is already gunning for these products, because they conflict with commercial greed. However, for the time being they’re available to use and they’re effective.

A/B Testing and Pursuit of ‘Engagement’

Here’s how to exploit people’s precious time and attention. Show a million people variation A, and another million people variation B. Which variation won – which got the highest ‘engagement’? Vary the best one slightly and repeat. Forever!

That’s how to home in on what works best with your audience – at any scale, at any level of detail – it’s the kind of thing that traditional market researchers drool over – it’s totally beyond their wildest dreams.
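To make the loop concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: a ‘variation’ is reduced to a single number, the audience is simulated by a made-up `true_click_rate` function, and ‘engagement’ is just a click count. Real systems measure far more, but the shape of the loop is the same.

```python
import random


def true_click_rate(variant: float) -> float:
    """Hypothetical hidden audience preference, peaking at variant = 0.8."""
    return max(0.0, 0.10 - 0.08 * abs(variant - 0.8))


def measure_clicks(variant: float, viewers: int, rng: random.Random) -> int:
    """Simulate showing one variation to `viewers` people and counting clicks."""
    rate = true_click_rate(variant)
    return sum(1 for _ in range(viewers) if rng.random() < rate)


def ab_round(champion: float, viewers: int, rng: random.Random,
             step: float = 0.05) -> float:
    """Test the champion against a slightly varied challenger; return the winner."""
    challenger = min(1.0, max(0.0, champion + rng.uniform(-step, step)))
    clicks_a = measure_clicks(champion, viewers, rng)
    clicks_b = measure_clicks(challenger, viewers, rng)
    return challenger if clicks_b > clicks_a else champion


def optimise(start: float, rounds: int, viewers: int, seed: int = 0) -> float:
    """Repeat the A/B round, always carrying the winner forward."""
    rng = random.Random(seed)
    variant = start
    for _ in range(rounds):
        variant = ab_round(variant, viewers, rng)
    return variant


if __name__ == "__main__":
    print(optimise(start=0.2, rounds=200, viewers=2000))
```

Note that nothing in the loop asks whether the winning variation is any good for the viewer; it only asks which one got more clicks.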

I used the word ‘engagement’, yet the web should really be bringing you ‘enrichment’. Engagement is how the ad-tech industry measures performance – in other words, how long did you spend looking at their stuff.

An ‘enriching’ experience really doesn’t figure in their reasoning about any of this. Only engagement counts. There may be coincidental overlap – an enlightening bit of comedy, or some well-explained science might leave you enriched in some small way, but these discoveries happen in spite of the system rather than because of it.

The very notion of ‘engagement’ as a measure is intimately tied to the ideas around ‘tracking’. You have unique identities, such as your email address, your Facebook/Insta/TikTok/whatever name, and your phone number. You also have many more unique identifiers that you will probably be oblivious to, such as your browser’s ‘fingerprint’, tracking cookies, and a whole slew of more technical stuff.

These unique identifiers allow the industry to measure your engagement, and without them it’s much harder to exploit your attention.

The next time you pick up your phone, or tablet, or laptop, ask yourself if the experience is ‘enriching’, or just ‘engaging’, and take a moment to consider just putting the device down.

Gemini Protocol & Markup

The Gemini project defines a protocol and markup for serving documents. The server runs on port 1965 by default, it responds to gemini:// URLs, it uses TLS (client certificates are supported, but optional), and it has a simple markup that fixes the shortcomings of Gopher pages without adding much extra. There are GUI, terminal, and command line clients already available.

My own Gemini server is at gemini:// At least one web-based proxy has been made available by some kind soul, but I recommend using a proper client to get the full ‘other-worldliness’ experience. I’m only half-kidding here, because the separation of protocol makes Gemini potentially quite valuable – it’s a hostile environment for the crap that has devalued HTTP.

Seriously, not even inline images are supported. You have to choose to download an image, just like any other link, and that’s a matter of principle in the protocol design. Naturally, this is one of the things that will prevent it from ever evolving in the same way as the web, which is kind of the whole point. The principle is that you are in control of what you view.

So, if you want to publish stuff somewhere that’s separate from the world wide web, then this seems like a good alternative. The markup is simple, yet gives headings, paragraphs, pre-formatted blocks, lists, and links. You literally type some words in a text file, and that’s your document ready to serve to the world. The immediacy and accessibility are refreshing (more so as I type this in WordPress, which is lagging in a browser on a fast 8-core Intel i7 laptop with 16GB of RAM).

There are barriers to entry – the most direct access for editing is via a shell, using a command line editor (I use Vim). I think this is well within the abilities of most people, but it will likely be daunting. The tilde communities seem to do a good job of making these things accessible, perhaps some of this could be applied to the fledgling Gemini community.

Below is a summary, but the full specification is at (it’s brief; it can currently be read in the time it takes to drink a coffee!)

# This would be an H1
## This would be an H2
### This would be an H3

=> Click me, I always exist on my own line.

> This line will show as a block-quote

* This is the first list item.
* This is another list item.

Pre-formatted blocks start and end with lines containing three back-ticks (```)

Anything else is just normal paragraph text, blank lines are always preserved.
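Because every line stands on its own, the markup can be classified one line at a time. The following is a rough sketch of that idea in Python, not a complete or official parser; the function name `classify_gemtext` and the tuple layout are my own, and it only covers the line types summarised above.

```python
def classify_gemtext(text: str) -> list:
    """Return a list of (kind, content) tuples, one per line of Gemini markup."""
    result = []
    preformatted = False
    for line in text.splitlines():
        if line.startswith("```"):
            # A line of three back-ticks toggles pre-formatted mode on or off.
            preformatted = not preformatted
            continue
        if preformatted:
            result.append(("pre", line))
        elif line.startswith("=>"):
            # Link line: "=> URL optional label"
            parts = line[2:].strip().split(maxsplit=1)
            url = parts[0] if parts else ""
            label = parts[1] if len(parts) > 1 else url
            result.append(("link", (url, label)))
        elif line.startswith("#"):
            # One to three leading '#' characters give the heading level.
            level = min(len(line) - len(line.lstrip("#")), 3)
            result.append((f"h{level}", line[level:].strip()))
        elif line.startswith("* "):
            result.append(("list", line[2:].strip()))
        elif line.startswith(">"):
            result.append(("quote", line[1:].strip()))
        else:
            result.append(("text", line))
    return result
```

The only state carried from one line to the next is whether we are inside a pre-formatted block, which is a large part of why clients are so easy to write.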

The simplest Gemini client is Bash

Using ncat, it’s possible to make a request to a Gemini server with the following: –

echo -ne 'gemini://\r\n'|ncat --ssl 1965

This results in the following on stdout. The first line is the server’s response code, the rest is my index.gmi content: –

20 text/gemini
Hello from Gemini dot Susa dot Net

=> lua_api.txt The Official Minetest Lua API for modding.
=> lua_api.gmi The Official Minetest Lua API in Gemini markup.
=> xtract_lua_api.awk My awk script to extract sections of the API.

=> petz_functions.gmi a list of member functions in the Petz mod.
=> minetest_pointed_thing.gmi my explanation of above and under nodes.

=> hexchat_no_such_device_or_address.gmi Fix HexChat Disconnected (No such device or address)

My web site is at: -


I can be reached via SMTP at

So simple, this is a great initiative; decisions seem to have been well considered. I’m grateful! For more, see
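The same exchange can be sketched in a few lines of Python using only the standard library. This is an illustration rather than a finished client: the function names are mine, it does not follow redirects, and it skips certificate verification (many Gemini servers, mine included, use locally issued certificates), where a real client would more likely pin the certificate on first use.

```python
import socket
import ssl
from urllib.parse import urlparse


def build_request(url: str) -> bytes:
    """A Gemini request is just the URL followed by CRLF."""
    return (url + "\r\n").encode("utf-8")


def parse_header(header_line: bytes) -> tuple:
    """Split a response header such as b'20 text/gemini' into (status, meta)."""
    status, _, meta = header_line.decode("utf-8").rstrip("\r\n").partition(" ")
    return int(status), meta


def fetch(url: str, timeout: float = 10.0) -> tuple:
    """Fetch a gemini:// URL and return (status, meta, body_bytes)."""
    parsed = urlparse(url)
    host = parsed.hostname
    port = parsed.port or 1965  # the default Gemini port

    # Assumption for this sketch: accept any certificate without checking it.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    data = b""
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(url))
            while True:
                chunk = tls.recv(4096)
                if not chunk:  # the server closes the connection when done
                    break
                data += chunk

    header, _, body = data.partition(b"\r\n")
    status, meta = parse_header(header)
    return status, meta, body
```

Calling `fetch()` with a gemini:// URL should return the status code (20 on success), the MIME type, and the raw page content, mirroring the ncat output above.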

A more featureful client: GemiNaut

A gemini (.gmi) page viewed in the GemiNaut browser.

For a server, I use gemserv (previously agate):

I have tried agate, which I think serves static files for a single host. I then spent ages getting GLV-1.12556 running, which has lots of well considered features, but it required a lot of fiddling and hacking to get it to compile and run, so it might be too much of an admin hassle.

I have currently switched to gemserv, which seems to have similarly useful features (virtual hosts, CGI, home directories), but runs as a single binary like agate, and the Rust project builds easily with cargo. There are a couple of links to CGI examples at gemini:// if you want to see the server in action.

My server needs a private key and certificate (just issued locally using openssl).

# This command created a key and certificate that's valid for any host on
openssl req -x509 -newkey rsa:4096 -keyout key.rsa -out cert.pem -days 3650 -nodes -subj "/CN=*"

# This command lets you view your server's certificate as human-readable text for troubleshooting.
openssl x509 -in cert.pem -text

The following is some server output, including the command line that launched the server (in this case agate).

kevin@kakapo:~/gemini$ ./agate.x86_64-unknown-linux-gnu ./content/ cert.pem key.rsa
Got request for "gemini://"
Got request for "gemini://"
Got request for "gemini://"
Got request for "gemini://"
Got request for "gemini://"
Error: Os { code: 2, kind: NotFound, message: "No such file or directory" }
Got request for "gemini://"
Got request for "gemini://"
Error: Os { code: 2, kind: NotFound, message: "No such file or directory" }
Got request for "gemini://"

My index.gmi file, to show a little of the markup.

Hello from Gemini

=> petz_functions.gmi a list of member functions in the Petz mod.

You may check out my website at: -

-- Lua code to log my name
minetest.log("Kevin")
Kevin said:
> This is quite a useable format to document Minetest mods. I think I'll use this in future.

### I'm signing off now
Good day to you sir. I bid you farewell.

If you have a gemini client, you can view this at gemini://

Personal Search Engine

I’ve been playing with the idea of Commerce Filtered Search, which is essentially a search engine of stuff that doesn’t exist simply to get you to click it. Sort of like how the Internet used to be: a place for those with boundless curiosity, made by those who just like to share what interests them.

My original plan was to ‘seed’ a web crawler with high-quality links, and fan out from there. The seed links were extracted from Hacker News, and Reddit was on my radar. I also wrote an add-on for Firefox that lets me record links that I happen to find for subsequent crawling. The resulting search engine (with an outdated index) can be found here. I find it hugely promising, but each iteration seems to yield more complexity – search is hard.

So, I’m paring back. My goal right now is a personal search engine that I can throw URLs at from various good sources, and have them fed into the index. I’m keeping a record of useful tools to help with this, and this page is currently a repository of information that should evolve into a more coherent post.

Interesting stuff to consider:

Sonic describes itself as “Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.”

This seems promising because it claims to work well with limited resources, which is something I’d trade some of Lucene’s flexibility for.

Linkalot is a basic (at the time of writing) self-hosted link manager, and can be used to accumulate interesting links. It works well with Send-Tab-URL extension.

jusText is used for boilerplate removal. It has many implementations and forks of the original unmaintained algorithm. You will likely find things like cookie consent and image captions among your text.

spaCy could possibly be used to add a bit of semantic reasoning or keyword extraction to give the index more meaningful information to store.

readability-cli is a library based on the Firefox Reader Mode code that provides a command to fetch and simplify pages to just the article text. As you’d expect, it works as well as Firefox Reader Mode, which is to say it works really well.

Preliminary report suggests 30% Alcohol effective against SARS-CoV-2 (Coronavirus)

Warning: I am neither a biologist, nor a journalist. My reading of this is naive, and is posted simply to draw attention to a study by seemingly reputable scientists. Seek qualified opinions, this may be nonsense.

Please include a link to this post if you pass this information on. To those people who misreport things for the purpose of click-bait, please don’t!

There is a preliminary report, posted on 17th March, which seems to find that lower concentrations of alcohol are effective against SARS-CoV-2.
I’m not qualified to have a meaningful opinion on this, but I’ll post it here to see if anyone can shed more light on it.

Notably, both tested alcohols, ethanol and 2-propanol were efficient in inactivating the virus in 30s at a minimal final concentration of at least 30%

Points to bear in mind:

  • This is only one study.
  • I may be misinterpreting the results.
  • The study was conducted in laboratory conditions, rather than the real world.
  • The study has not yet been peer-reviewed.

The implication, if true, is that alcohol such as Vodka or Rum could be used as a sanitiser for hands and surfaces, particularly relevant given the current shortages of pharmaceutical and industrial products.

All advice is that hand-washing with soap is the most effective way to destroy SARS-CoV-2. However, where running water is not readily available, an effective hand-sanitiser would be the next best thing.

My own approach is currently to act on the findings in this study, but as a last resort.

Hand-washing with soap is, by far, my first choice. A hand-sanitiser that meets WHO standards is my second choice.

If neither of these is available to me, then I’m using a small spray bottle filled with some 40% alcohol (by volume).

The nghttpx Reverse Proxy

I want to expose different containers on specific URL paths, possibly on different hosts, and nghttpx, from the nghttp2 library by Tatsuhiro Tsujikawa, does this in an intuitive way, and does a lot more besides.

    sudo apt-get install nghttp2-proxy

An example configuration is installed in /etc/nghttpx/nghttpx.conf, which configures nghttpx to listen for cleartext HTTP on localhost:3000 and proxy (i.e. forward) requests to localhost:80.

uBlock Origin: exporting the blocked hosts

I’ve been working on and off on the Commerce Filtered Search Engine (CFS), essentially a survey of the web to find sites that are not commercially driven, in order to index them for searching. The idea is that if we can filter out all the click-bait and commercial stuff, what’s left might actually be interesting, novel, and informative (and a fair bit of rubbish, I expect, but perhaps it’ll at least be honest and sincere rubbish).

Up until now, I’ve been using Puppeteer with uBlock Origin. I was able to handle request failures and check for the error net::ERR_BLOCKED_BY_CLIENT, which indicates that a request was deemed to be ad or tracking related. The more hits per page I survey, the more spammy I rank the site.
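The ranking idea can be sketched independently of Puppeteer. The Python below assumes the survey has already been collected into a dictionary mapping each page URL to the list of error strings seen for its failed requests; that data shape and the function name `spam_scores` are hypothetical, chosen only to illustrate ‘more blocked hits per page means a spammier site’.

```python
from collections import defaultdict
from urllib.parse import urlparse

# The error text that indicates a request was blocked by the ad-blocker.
BLOCKED = "net::ERR_BLOCKED_BY_CLIENT"


def spam_scores(survey: dict) -> dict:
    """Return {host: average blocked requests per surveyed page}.

    `survey` maps a page URL to the list of error strings recorded for
    that page's failed requests (an assumed format for this sketch).
    """
    totals = defaultdict(lambda: [0, 0])  # host -> [blocked hits, pages surveyed]
    for page_url, errors in survey.items():
        host = urlparse(page_url).hostname
        totals[host][0] += sum(1 for error in errors if error == BLOCKED)
        totals[host][1] += 1
    return {host: hits / pages for host, (hits, pages) in totals.items()}


if __name__ == "__main__":
    example = {
        "https://a.example/1": [BLOCKED, BLOCKED, "net::ERR_FAILED"],
        "https://a.example/2": [BLOCKED],
        "https://b.example/": [],
    }
    # Print hosts from most to least spammy.
    for host, score in sorted(spam_scores(example).items(),
                              key=lambda item: item[1], reverse=True):
        print(host, score)
```

Averaging per page, rather than summing, stops a site from looking spammy just because more of its pages happened to be surveyed.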

Longan Nano (GD32VF103)

This was an impulse purchase, because for some unfathomable reason I really wanted a RISC-V CPU to play with. Sipeed’s Longan Nano, a small board based on the GigaDevice GD32VF103 SoC, is just that: a RISC-V CPU with a bundle of decent peripherals. There are links below if you want details on this board.

The GD32V implements an RV32IMAC CPU, where ‘RV32I’ refers to a 32-bit CPU with the Base Integer Instruction Set, the ‘M’ denotes the Standard Extension for Integer Multiplication and Division, the ‘A’ denotes the Extension for Atomic Instructions, and the ‘C’ refers to the Extension for Compressed Instructions (i.e. 16-bit opcodes for commonly used instructions, useful for this memory constrained device).
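As a small illustration of how the name breaks down, here is a Python sketch that splits an ISA string like RV32IMAC into its register width and extension letters. The function name `decode_isa` is my own, and the table only covers the four letters described above; anything else is reported as not covered rather than guessed at.

```python
import re

# Only the extensions described in the text above.
EXTENSIONS = {
    "I": "Base Integer Instruction Set",
    "M": "Integer Multiplication and Division",
    "A": "Atomic Instructions",
    "C": "Compressed Instructions",
}


def decode_isa(name: str) -> tuple:
    """Split e.g. 'RV32IMAC' into (32, [description, ...])."""
    match = re.fullmatch(r"RV(32|64|128)([A-Z]+)", name.upper())
    if match is None:
        raise ValueError(f"not a recognised RISC-V ISA string: {name!r}")
    width = int(match.group(1))
    described = [EXTENSIONS.get(letter, f"{letter}: not covered by this sketch")
                 for letter in match.group(2)]
    return width, described


if __name__ == "__main__":
    width, parts = decode_isa("RV32IMAC")
    print(f"{width}-bit:")
    for part in parts:
        print("  -", part)
```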

Seeed, 4PX, and Yodel – Really Quite Good

So Seeed finally dispatched my Sipeed Longan Nano, a little RISC-V SoC that was too interesting to resist. I say finally not because they were tardy, but because it was on back order. I had to write a few words about the delivery, because while I’ve always been impressed by Chinese suppliers, the ability to track with the level of detail shown below, from Shenzhen to just south of Edinburgh, is particularly impressive.

Here’s the tracking from Shenzhen, China to Livingston, Scotland

LXD eMail, SMTP/IMAP/WebMail with OpenSMTPD, Dovecot, and Roundcube.

Email is one of those conceptually simple things that are a lot more complex in practice – get it wrong and you miss incoming mail, or your mail gets lost or junked, or spammers exploit your server.

This post is intended for technical people who want to run their own personal mail server, and describes the steps required to get a basic server set up that can be run safely and reliably.

VirtualBox VMDK for Raw Disk Access on a Windows Host

Do not act on this article unless you are prepared to trash your disks, or if you are absolutely sure you understand what you are doing. Messing with raw disk sectors is risky!

VirtualBox allows us to use a disk device directly, rather than using a file as a virtual volume. For me, since I have two SSDs in my laptop, it meant I could tinker with virtual machines without risking my Windows 7 partition, while also being able to boot the VMs on real hardware if I wanted.

Arduino Yun Reading WH1080 using AUREL RX-4MM5

Here’s the sketch. It just reads and dumps to the console; the bridge can be used to send the data to the GNU/Linux side of the Yun.

See the other post on doing this with a Raspberry Pi for some code to turn the data into something useful.

I’m using the MCU of the Yun to do the RF stuff, with the AUREL RX-4MM5 (a proper OOK receiver), and it seems a lot more dependable than the Raspberry Pi + RFM01 (or RFM12B).

Raspberry Pi reading WH1081 weather sensors using an RFM01 and RFM12b

This article describes using an RFM01 or RFM12b FSK RF transceiver with a Raspberry Pi to receive sensor data from a Fine Offset WH1080 or WH1081 (specifically a Maplin N96GY) weather station’s RF transmitter.

I originally used the RFM12b, simply because I had one to hand, but later found that the RFM01 appears to work far better – the noise immunity and the range of the RFM01 in OOK mode are noticeably better. They’re pin compatible, but the SPI registers differ between the modules, in terms of both register address and function.

This project is changing to be microcontroller based, using an AM receiver module (Aurel RX-4MM5), which is a much more effective approach; see the post ‘Arduino Yun Reading WH1080 using AUREL RX-4MM5’. I’m currently testing on an Arduino Yun, but will probably move to a more platform agnostic design to support Dragino and Carambola etc.

LXD now runs my WordPress

Here are some notes on how I used LXD to run a container for WordPress. This is (a lot) more convenient than using Docker, which was my original approach to getting my WordPress site into a container. The main advantage for me is that a single container runs all the components together – no need for the ‘wiring’ between containers for each process.

There is a bash script that automates this at, and is a more complete description of the process since it automatically configures SSL/TLS and Exim.

PIC/MOSFET PWM Model Train Controller

Having been unable to resist buying some old Hornby OO Gauge bits from the second hand cabinet in a model shop, justification came from the educational value it would offer my son if I could make a speed controller, perhaps adding a sensor or two – the essence of industrial control and feedback mechanisms. Being three and a half, he just wanted to make the train fly off the track, but at least he enjoyed it.

This is a project to create a model train speed controller using the Pulse Width Modulation (PWM) output of a PIC16F690 microcontroller, to drive a MOSFET that ultimately controls the voltage on the tracks. The train will automatically switch into reverse when the control is turned anti-clockwise through the zero point.
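The control-to-output mapping can be modelled in a few lines of Python before committing anything to the PIC. This is a sketch under stated assumptions, not the firmware: it assumes the control is read as a 10-bit value (0 to 1023), that the zero point sits at mid-travel, and that a small dead band around zero keeps the train stopped. The function name and the numbers are illustrative only.

```python
def control_to_drive(adc_value: int, adc_max: int = 1023,
                     dead_band: int = 16) -> tuple:
    """Map a control reading to (direction, duty), with duty from 0.0 to 1.0.

    Assumptions for this sketch: the zero point is at mid-travel, clockwise
    of it is forward, anti-clockwise of it is reverse.
    """
    centre = (adc_max + 1) // 2          # 512 for a 10-bit reading
    offset = adc_value - centre
    if abs(offset) <= dead_band:
        return ("stopped", 0.0)
    direction = "forward" if offset > 0 else "reverse"
    # Scale the travel outside the dead band onto 0.0 .. 1.0.
    usable_travel = centre - 1 - dead_band
    duty = min(1.0, (abs(offset) - dead_band) / usable_travel)
    return (direction, duty)


if __name__ == "__main__":
    for reading in (0, 300, 512, 700, 1023):
        print(reading, control_to_drive(reading))
```

The duty value is the fraction of each PWM period that the MOSFET is switched on, so the average track voltage is roughly that fraction of the supply voltage.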

Braun ThermoScan Fix – Low Battery Warning Switch Off

We have a Braun Thermoscan infra-red (IR) thermometer that has been working perfectly for about five years. It started complaining about low batteries and shutting off, despite my replacing them with new batteries that I checked had plenty of charge.

When I opened it, I discovered that the batteries connect to the circuit board via simple metal clip contacts, and that the contacts had some corrosion on them, which was preventing power from getting to the board, hence why it was complaining of low batteries.

So a very simple fix is to just clean the corrosion from the battery terminals inside the thermometer. You’ll need a Torx T9 screwdriver (Maplin, eBay, Amazon, maybe pound shops).

Raspberry Pi Power Controller

This article is a work in progress to create a power-controller for the Raspberry Pi based on a PIC microcontroller and MOSFET. The PIC implements an I2C slave to allow power control, and also to approximate the registers of a PCF8563 Real Time Clock (RTC) chip, to allow timed wake-up of the Pi.

  • Power the Raspberry Pi off and on with a push-button.
  • Fully shut down the Raspberry Pi on ‘shutdown -h’.
  • Wake-up at a specified time (one-off or periodic).
  • Monitor the supply voltage.
  • Log glitches in the power-supply (e.g. caused by USB device activity).
  • Maintain the time from a CR2032 button cell.

During power-down, the circuit currently draws around 5μA of current, useful where a battery is being used to power the Pi (remote solar-power applications, or in-car systems, for example).

The Pi is able to instruct the PIC to power it down using a short I2C command sequence. Wake up events include a push-button, or other voltage-sense on an input pin.

Raspberry Pi – Driving a Relay using GPIO

There’s something exciting about crossing the boundary between the abstract world of software and the physical ‘real world’, and a relay driven from a GPIO pin seemed like a good example of this. Although a simple project, I still learned some new things about the Raspberry Pi while doing it.

There are only four components required, and the cost for these is around 70p, so it would be a good candidate for a classroom exercise. Even a cheap relay like the Omron G5LA-1 5DC can switch loads of 10A at 240V.

Fighting Click-Bait

The Internet seems awash with ‘click-bait’ and sponsored content – articles created primarily to generate money, sometimes plagiarised, misleading, exaggerated, or provocative just to get views. The good stuff – articles often written simply because it’s good to share knowledge and ideas – is getting harder to find.

My proposal is to create a search engine that, rather than systematically crawl the web, starts with a seed corpus of high quality links, and fans out from there, stopping when the quality drops. The result will hopefully be a searchable index of pages that were created to impart information rather than to earn cash from eyeballs.
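In outline, the proposal is a breadth-first crawl that refuses to go any further through a page that scores badly. Here is a minimal Python sketch of that shape. The `get_links` and `quality` callables are placeholders supplied by the caller (how to actually score quality is the hard part, and is not addressed here), and the threshold of 0.5 is an arbitrary assumption.

```python
from collections import deque


def crawl(seeds, get_links, quality, threshold=0.5, max_pages=1000):
    """Fan out from seed URLs, stopping wherever quality drops.

    get_links(url) -> iterable of URLs linked from that page (caller-supplied)
    quality(url)   -> score between 0.0 and 1.0 (caller-supplied)
    Returns the list of URLs judged good enough to index.
    """
    index = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(index) < max_pages:
        url = queue.popleft()
        if quality(url) < threshold:
            continue  # low quality: don't index it, and don't follow its links
        index.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


if __name__ == "__main__":
    # A toy in-memory 'web' standing in for real fetching and scoring.
    graph = {"seed": ["good", "spam"], "good": ["deep"], "spam": ["hidden"]}
    scores = {"seed": 0.9, "good": 0.8, "spam": 0.1, "deep": 0.7, "hidden": 0.9}
    print(crawl(["seed"], lambda u: graph.get(u, []), lambda u: scores[u]))
```

In the toy example the ‘hidden’ page is never visited, even though it scores well, because the only route to it runs through a low-quality page. That is the intended trade-off: some good pages are missed in exchange for never wading through the bad ones.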

Docker WordPress in a subdirectory

Moving a standard WordPress installation to a different host is a minor pain – I only do this occasionally, so every time I need to consider the configuration of the original environment and how this translates to the new server. Nothing too challenging, but tedious and prone to error.

So I figured Docker containers are the way to go and, sure enough, Docker Hub has more than enough images for my needs. The only issue is that I don’t dedicate my server to WordPress – it’s in a ./wordpress subdirectory of the web root. Docker’s official WordPress image keeps reinstating the WordPress files if they’re not found in the web root.

Atech Postal – notes on the Fast Server

Atech’s Postal is an SMTP server and web management interface that’s geared towards transactional and bulk mailing (e.g. for application to user communication, and for marketing respectively). It’s quite well documented, but more importantly it’s open source (MIT license), and also seems well written – elegant, self-documenting code that’s easy to follow, useful comments, well structured. A bit of a joy really.

The Fast Server is a web server process, separate from the management interface server, that handles requests from the click and open tracking links and so logs email Open and Click events. However, the documentation on the Fast Server seems to be at least partially out of date, so I thought I’d dig into the code to understand and document the bits that I was unsure of.

Raspberry Pi GPFSEL, GPIO, and PADS Status Viewer

The gpfsel_list (I maybe should have called it lsgpio) utility displays a list of the currently configured function selections across all available GPIO pins and, for pins configured as GPIO, the current state of the pins. For pins configured with ALTn functions, the selected function is listed according to the datasheet information.

It also shows the state of the PADS registers to display the configured drive current, hysteresis, and slew setting for the three groups of pins (GPIO 0-27, 28-45, and 46-53).

It’s been written to produce output that’s easy to grep and cut, and performs only read operations on the registers – it can’t be used to modify settings, though I suppose this could change in future.
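For anyone curious about the decoding step, here is a short Python sketch of how a function select register value breaks down, based on my reading of the BCM2835 datasheet: each GPFSEL register covers ten pins, with three bits per pin. This is not the utility’s own code, and it works on a register value that has already been read; the function name `decode_gpfsel` is mine.

```python
# Three-bit function select codes (note that the ALT numbering is not in
# binary order).
FSEL_NAMES = {
    0b000: "INPUT",
    0b001: "OUTPUT",
    0b100: "ALT0",
    0b101: "ALT1",
    0b110: "ALT2",
    0b111: "ALT3",
    0b011: "ALT4",
    0b010: "ALT5",
}

LAST_GPIO = 53  # GPIO pins are numbered 0 to 53


def decode_gpfsel(register_index: int, value: int) -> dict:
    """Decode one GPFSELn register value into {pin number: function name}."""
    pins = {}
    for slot in range(10):                   # ten pins per register
        pin = register_index * 10 + slot
        if pin > LAST_GPIO:
            break                            # GPFSEL5 only covers pins 50-53
        code = (value >> (slot * 3)) & 0b111  # three bits per pin
        pins[pin] = FSEL_NAMES[code]
    return pins


if __name__ == "__main__":
    # Example: GPFSEL1 with pin 10 set to ALT0 and pin 14 set as an output.
    example = (0b001 << 12) | 0b100
    for pin, function in decode_gpfsel(1, example).items():
        print(f"GPIO{pin}\t{function}")
```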
