The MOS 6502’s Parallel Binary/BCD Adder patent expired today!

The patent for the 6502’s ALU adder circuits expired today, by chance the same day I wrote a BCD adder in Verilog to go with a toy 6502ish CPU I am designing. So, I thought I’d re-implement it loosely based on the MOS solution.

U.S. Patent Nov. 9, 1976 Sheet 1 of 3 3,991,307

The following schematic is unlikely to be needed, but might shed some light on the details. The scans have been pasted together, but don’t quite align towards the bottom. Some squinting will be required to make sense of it. There are a few lines that look unconnected, which I assume are just separators.

Schematic of the Parallel Binary/BCD Adder of the MOS 6502, from patent US3991307

Roughly, the schematic is formed of three columns: the left side defines the binary adder, the middle shows the decimal carry calculation, and the right side generates the decimal correction.

The patent text really just describes the gates shown above. However, it talks about them logically, rather than referring to the actual gates used, so unless you can apply De Morgan’s laws as you read (I can’t!), it will be slow going trying to figure it all out.
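
For anyone (like me) who needs the reminder: De Morgan’s laws say that the complement of an AND is the OR of the complements, and vice versa, which is the transformation you need when the text describes the logic one way and the schematic implements it with inverting gates. A throwaway C check of both identities (my own, not from the patent):

#include <assert.h>
#include <stdio.h>

int main(void)
{
        // Exhaustively check both identities on single-bit inputs:
        //   !(a AND b) == !a OR !b     and     !(a OR b) == !a AND !b
        for (int a = 0; a <= 1; a++)
                for (int b = 0; b <= 1; b++) {
                        assert(!(a && b) == (!a || !b));
                        assert(!(a || b) == (!a && !b));
                }
        printf("De Morgan's laws hold for 1-bit inputs\n");
        return 0;
}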

The C code below is a behavioural implementation of the circuits described in the patent. It doesn’t correspond exactly to the diagram in Fig. 1; e.g. the carry flags are generated in the adder module rather than being derived separately using gates.

The code in the main() block generates test output for the adder, for the decimal corrector, and then for both parts combined, with two 4-bit stages forming an 8-bit adder.

#include <stdio.h>
#include <stdbool.h>

/*
 * The Parallel Binary Adder - (4-bit) module.
 */
void pbAdder(
        unsigned char A,        // 4-bit nibble
        unsigned char B,        // 4-bit nibble
        bool Cin,               // Carry in
        unsigned char *BS,      // Binary Sum
        bool *BC,               // Binary Carry
        bool *DC)               // Decimal Carry
{
        // Sanitise 8-bit values to 4 bits.
        A = A & 0xf;
        B = B & 0xf;

        *BS = (A + B + Cin) & 0xf;
        *BC = ((A + B + Cin) & 0x10) != 0;
        *DC = (A + B + Cin) > 9;
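        // Note: the carries here are computed arithmetically; in the
        // patent (Fig. 1) they are derived from separate gate logic.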

        printf("A=%d, B=%d, BS=%d, BC=%d, DC=%d\n", A, B, *BS, *BC, *DC);
}

/*
 * Decimal correction gating, generating DS (final ACC output)
 */
void dcGating(
        unsigned char BS,       // 4-bit Binary Sum
        bool DAA,               // Decimal Addition flag
        bool DSA,               // Decimal Subtraction flag
        unsigned char *DS)      // Decimal Sum (or Binary pass-thru)
{
        bool BS0 = (BS & 1);
        bool BS1 = (BS & 2) >> 1;
        bool BS2 = (BS & 4) >> 2;
        bool BS3 = (BS & 8) >> 3;

        bool DS0 = BS0;
        bool DS1 = (DAA | DSA) ^ BS1;
        bool DS2 = ((DSA & BS1) | (DAA & ~BS1)) ^ BS2;
        bool DS3 = ( ((BS1 | BS2) & DAA) | (~BS1 | ~BS2) & DSA) ^ BS3;

        printf("BS: %d%d%d%d\t", BS3, BS2, BS1, BS0);
        printf("DS: %d%d%d%d\n", DS3, DS2, DS1, DS0);

        // Select the corrected DS (decimal mode, binary sum 0xa thru 0xf)
        *DS = (DAA | DSA) & BS3 & (BS1 | BS2)
                ? (DS3 << 3) | (DS2 << 2) | (DS1 << 1) | DS0
                : (BS3 << 3) | (BS2 << 2) | (BS1 << 1) | BS0;
        printf("ACC OUT: %02X\n", *DS);
}

int main(void)
{
        // Test some 4-bit addition and carry outputs.
        unsigned char sum;
        bool BC3;       // Binary Carry from adder bit 3
        bool DC3;       // Decimal Carry from adder bit 3

        // Test data 1 - mostly boundary cases. {A, B, Cin}
        unsigned char td1[][3] = { {1,2,0}, {8,1,0}, {8,1,1},
                {9,1,0}, {9,1,1}, {14,1,0}, {14,1,1} };
        int num_td1 = sizeof(td1) / sizeof(*td1);

        int i;
        for(i=0; i < num_td1; i++) {
                pbAdder(td1[i][0], td1[i][1], td1[i][2],
                        &sum, &BC3, &DC3);
        }

        // Enumerate adder outputs 9 thru 15, with decimal corrections.
        unsigned char accumulator;

        for(i=9; i < 16; i++) {
                printf("\n");

                dcGating(i, 1, 0, &accumulator);
        }
        printf("\n");

        // Test combination of two 4-bit adders

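        // Test data 2 - 8-bit operations. {A, B, Cin, DAA, DSA}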
        unsigned char td2[][5] = {
                {0, 15, 0, 0, 0},
                {240, 15, 0, 0, 0},
                {0x98, 1, 0, 1, 0},
                {0x98, 1, 1, 1, 0},
                {0x98, 2, 0, 1, 0},
                {0x99, ~2, 1, 0, 1},
                {0x99, ~0x22, 1, 0, 1},
                {100, ~2, 1, 0, 0}
        };
        int num_td2 = sizeof(td2) / sizeof(*td2);

        for(i=0; i < num_td2; i++) {

                unsigned char A=td2[i][0], B=td2[i][1];
                bool Cin = td2[i][2];
                bool DAA=td2[i][3], DSA=td2[i][4];
                unsigned char sumL, sumH;
                unsigned char ACC;

                printf("A:%02X (%d), B:%02X (%d), Cin:%d\n",
                        A, A, B, B, Cin);

                pbAdder(A&0xf, B&0xf, Cin, &sumL, &BC3, &DC3);
                dcGating(sumL, DAA, DSA, &sumL);
                pbAdder((A&0xf0)>>4, (B&0xf0)>>4, (DAA&DC3)|BC3,
                        &sumH, &BC3, &DC3);
                dcGating(sumH, DAA, DSA, &sumH);

                ACC=sumH * 16 + sumL;
                printf("8-Bit OUT: 0x%02X (%d) BC=%d, DC=%d\n\n",
                        ACC, ACC, BC3, DC3);
        }
        return 0;
}

A:00 (0), B:0F (15), Cin:0
A=0, B=15, BS=15, BC=0, DC=1
BS: 1111        DS: 1111
ACC OUT: 0F
A=0, B=0, BS=0, BC=0, DC=0
BS: 0000        DS: 0000
ACC OUT: 00
8-Bit OUT: 0x0F (15) BC=0, DC=0

A:F0 (240), B:0F (15), Cin:0
A=0, B=15, BS=15, BC=0, DC=1
BS: 1111        DS: 1111
ACC OUT: 0F
A=15, B=0, BS=15, BC=0, DC=1
BS: 1111        DS: 1111
ACC OUT: 0F
8-Bit OUT: 0xFF (255) BC=0, DC=1

A:98 (152), B:01 (1), Cin:0
A=8, B=1, BS=9, BC=0, DC=0
BS: 1001        DS: 1111
ACC OUT: 09
A=9, B=0, BS=9, BC=0, DC=0
BS: 1001        DS: 1111
ACC OUT: 09
8-Bit OUT: 0x99 (153) BC=0, DC=0

A:98 (152), B:01 (1), Cin:1
A=8, B=1, BS=10, BC=0, DC=1
BS: 1010        DS: 0000
ACC OUT: 00
A=9, B=0, BS=10, BC=0, DC=1
BS: 1010        DS: 0000
ACC OUT: 00
8-Bit OUT: 0x00 (0) BC=0, DC=1

A:98 (152), B:02 (2), Cin:0
A=8, B=2, BS=10, BC=0, DC=1
BS: 1010        DS: 0000
ACC OUT: 00
A=9, B=0, BS=10, BC=0, DC=1
BS: 1010        DS: 0000
ACC OUT: 00
8-Bit OUT: 0x00 (0) BC=0, DC=1

A:99 (153), B:FD (253), Cin:1
A=9, B=13, BS=7, BC=1, DC=1
BS: 0111        DS: 0001
ACC OUT: 07
A=9, B=15, BS=9, BC=1, DC=1
BS: 1001        DS: 0011
ACC OUT: 09
8-Bit OUT: 0x97 (151) BC=1, DC=1

A:99 (153), B:DD (221), Cin:1
A=9, B=13, BS=7, BC=1, DC=1
BS: 0111        DS: 0001
ACC OUT: 07
A=9, B=13, BS=7, BC=1, DC=1
BS: 0111        DS: 0001
ACC OUT: 07
8-Bit OUT: 0x77 (119) BC=1, DC=1

A:64 (100), B:FD (253), Cin:1
A=4, B=13, BS=2, BC=1, DC=1
BS: 0010        DS: 0010
ACC OUT: 02
A=6, B=15, BS=6, BC=1, DC=1
BS: 0110        DS: 0110
ACC OUT: 06
8-Bit OUT: 0x62 (98) BC=1, DC=1

The output from the 8-bit operations is shown above. Note that the patent is based on the NMOS version of the 6502, and predates the CMOS version that included changes to the way flags get handled.

The test data provides negative numbers pre-converted to 2’s complement form, which would normally be done within the ALU.
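
As an aside, a 6502-style ALU performs subtraction as A + ~B + carry; that is, it feeds the adder the one’s complement of B with the carry-in set, which together give the two’s complement. A hypothetical helper (my own sketch, not from the patent) that prepares operands the same way as the ~2 and ~0x22 entries above:

#include <stdbool.h>
#include <stdio.h>

/*
 * Prepare the B operand and carry-in for a subtraction, as a 6502-style
 * ALU would: invert B and set the carry, so that A + ~B + 1 == A - B.
 */
static void prepSubtract(unsigned char B, unsigned char *Binv, bool *Cin)
{
        *Binv = ~B;     // one's complement of B
        *Cin = true;    // the +1 that completes the two's complement
}

int main(void)
{
        unsigned char Binv;
        bool Cin;

        prepSubtract(0x22, &Binv, &Cin);        // same as the ~0x22 test case
        printf("B=0x22 -> ~B=0x%02X, Cin=%d\n", Binv, Cin);
        return 0;
}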

This code was written as an intermediate step to be subsequently converted to Verilog – I’m more familiar with C, so this made sense to me. A hardware engineer would doubtless go straight to the testbench.

The Verilog code below behaviourally describes the 4-bit adder. The carry input to the next 4-bit adder is Cdec (decimal carry), though in the full implementation this would be (DAA & Cdec) | Cbin (decimal flag AND decimal carry, OR binary carry).

/*
 * BCD Digit Adder
 * Given two 4 bit inputs and a carry, produce a BCD addition.
 */

module bcdAdder(
  input [3:0] A,
  input [3:0] B,
  input Cin,
  output [3:0] OUT,
  output Cbin,
  output Cdec
);

wire [4:0] binSum;

assign binSum = (A + B) + Cin;
assign OUT = (binSum > 9) ? binSum + 6 : binSum;
assign Cbin = binSum[4];
assign Cdec = binSum > 9;

endmodule

All those gates described in eight pages of a patent in the mid 1970s boil down to the simple behaviour above. What was arguably patentable a few decades ago is, given today’s tools, meaningless as any kind of ‘invention’. I see this as an example of why the patent system needs to keep pace with tools. What made sense in the mid 1970s is absurd today.
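
To convince myself of that, here’s a quick brute-force check (my own, not from the patent) that the DAA gate equations used in dcGating() above really are just ‘add 6 and keep 4 bits’ for every sum that gets corrected:

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
        // For each binary sum that triggers a decimal-add correction
        // (0xA thru 0xF), compare the gate-level DS equations (with
        // DAA=1, DSA=0) against a plain "add 6, keep 4 bits".
        for (int BS = 0xA; BS <= 0xF; BS++) {
                bool BS0 = BS & 1, BS1 = (BS >> 1) & 1;
                bool BS2 = (BS >> 2) & 1, BS3 = (BS >> 3) & 1;

                int DS = BS0
                        | (!BS1 << 1)                   // DS1 = DAA ^ BS1
                        | ((!BS1 ^ BS2) << 2)           // DS2 = (DAA & ~BS1) ^ BS2
                        | (((BS1 | BS2) ^ BS3) << 3);   // DS3 = ((BS1|BS2) & DAA) ^ BS3

                printf("BS=%X  gates=%X  (BS+6)&0xF=%X\n",
                        BS, DS, (BS + 6) & 0xF);
        }
        return 0;
}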

Anyway, I digress. Here’s the testbench code that puts two of these together to form an 8-bit BCD parallel adder, and generates signals to show typical and boundary calculations.

`timescale 1us / 1us

module bcdd_tb;

reg [7:0] A;
reg [7:0] B;
reg Cin;
wire [7:0] OUT;
wire hCbin;
wire hCdec;
wire Cbin;
wire Cdec;

initial begin
  $dumpfile("bcdd_tb.lxt");
  $dumpvars(0, bcdd_tb, A,B,Cin,OUT,hCbin,hCdec,Cbin,Cdec);

  // 1 + 7 = 8
  #0 Cin = 0;
  #0 A = 1;
  #0 B = 7;

  // 8 + 3 = 1 with C
  #1 Cin = 0;
  #0 A = 8;
  #0 B = 3;

  // 1 + 0 = 1
  #1 Cin = 0;
  #0 A = 1;
  #0 B = 0;

  // 8 + 0 + C = 9
  #1 Cin = 1;
  #0 A = 8;
  #0 B = 0;

  // 75 + 21 = 96
  #1 Cin = 0;
  #0 A = 8'h75;
  #0 B = 8'h21;

  // 98 + 1 + C = 100
  #1 Cin = 1;
  #0 A = 8'h98;
  #0 B = 8'h1;

  // 99 + 1 = 100
  #1 Cin = 0;
  #0 A = 8'h99;
  #0 B = 8'h1;

  #1 $finish;
end

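// Two digit slices: the low nibble's carry feeds the high nibble's Cin.
// This testbench is decimal-only, so the carry is simply hCbin|hCdec rather
// than the (DAA & Cdec) | Cbin gating described earlier.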
bcdAdder bcdd1(
  A[3:0], B[3:0], Cin, OUT[3:0], hCbin, hCdec
);
bcdAdder bcdd2(
  A[7:4], B[7:4], hCbin|hCdec, OUT[7:4], Cbin, Cdec
);

endmodule

The resulting waveforms are shown below; they include both nibble adders as well as the register inputs and outputs.

So, that’s good enough for now to go into the ALU of my toy 6502ish CPU. I have a mild urge to reproduce the schematic in Verilog at gate level, but I can’t face the (admittedly modest) boolean algebra needed. I’m not made of the same stuff as the folk who invented this!

Something’s wrong with the Internet

Looking for a present for a four-turning-five year old who loves numbers, I figured I’d find some ideas around number cubes. Let’s see what maths cube toys I can find…

Where them cubes gone?

None of the top hits was even cube related; even those with ‘cube’ in the title did not refer to anything remotely cube-like in shape or character. The irony of these search results tempers my frustration with at least a little humour. The rest of the results were no better.

I thought that Google images would give me a broader view that might help me spot something interesting.

Spot the cube.

No maths related cubes here – there are cubes that look fun, but just cubes, and there are maths toys that look at least interesting, but not really what I was looking for.

Yet I know there are businesses that seek out and carefully choose toys that are genuinely interesting and innovative. I’ve seen them in the past, and I’m certain that there are lots more – so why are my search results dominated by Amazon and eBay sellers peddling pretty much exactly the same stuff?

To be fair, I managed to find one site among the results that’s worth a browse, and the second time I did an image search, the results also included a power-of-two cube that looked interesting. But wow – talk about drowned out by the noise!

This isn’t a ‘things were better in the good old days’ sort of a rant. It’s a clear statement – discovery of sites on the World Wide Web used to be a lot better than it appears to be today.

LXD now runs my WordPress

Here are some notes on how I used LXD to run a container for WordPress. This is (a lot) more convenient than using Docker, which was my original approach to getting my WordPress site into a container.

Getting LXD onto Debian Stretch

LXD is installed on Debian via a Snap package, so sudo apt-get install snapd if this is not already installed. See https://docs.snapcraft.io/installing-snap-on-debian. Then run snap install lxd (see https://stgraber.org/2017/01/18/lxd-on-debian/) and log in again to get an updated command path to the new snap-installed binaries.

Run lxd init to configure the default environment. The storage type I chose was simply directories, since that’s the most convenient for moving files from the Docker setup I’m migrating from.

Create and configure our container

When lxd is installed, create a new Debian container with lxc launch 'images:debian/9' susanet-wp.

Use lxc exec susanet-wp passwd to set a root password, then lxc console susanet-wp to log into the console. From here we can install the required packages.

apt-get install apache2 php-curl php-gd php-intl php-mbstring php-soap \
  php-xml php-xmlrpc php-zip libapache2-mod-php php-mysql \
  libphp-phpmailer mariadb-server mariadb-client iputils-ping \
  exim4-daemon-light curl wget netcat

From here it’s pretty much a normal WordPress installation. Since I was migrating from another database, the commands used to get MariaDB set up were as follows: –

create database wp_db;
create user 'wp_user'@'localhost' identified by '<db password>';
grant all privileges on wp_db.* to 'wp_user'@'localhost';
flush privileges;

I used this command to install my SQL dump file taken from the old Docker setup: –

zcat kakapo_wordpress_db.gz|lxc exec susanet-wp -- mysql wp_db 

Some notes on LXD

LXD creates containers from locally stored images, though these images might themselves be fetched from a remote server.

There are a number of pre-configured public repositories, which can be viewed with lxc remote list, and if you have another LXD installation elsewhere, then this can be used as a further remote server.

The command to register a new remote server is lxc remote add myremote 10.81.1.4, where the IP address is that of another server running LXD, and ‘myremote’ is the alias by which I want to refer to the remote server.

Note that the remote server must be exposed on port 8443 (by default) of the specified IP. A password also has to be defined – clients will be prompted for this when adding this remote server. The following commands will configure the remote server.

lxc config unset core.https_address
lxc config set core.https_address [::]:8443
lxc config set core.trust_password <my_remote's password>

A snapshot in LXD refers to the state of a container as at a specific point in time, and can be used to easily restore the state of the container.

An image can be created from a stopped container, or from a snapshot of a running container. The following commands are listed as examples of usage: –

lxc snapshot susanet-wp my-snapshot
lxc publish susanet-wp/my-snapshot --alias my-new-image
lxc delete susanet-wp/my-snapshot

The snapshot command takes a snapshot of the given container. The publish command creates a local image from this snapshot, and the delete command removes the snapshot (assuming you no longer want it).

Putting the above together, this can be used to copy a container to a backup server. The main local server would be configured to bind to an IP address/socket and given a password, and the backup server adds this as a remote. It can then ‘launch’ this image.

Alternatively, it’s even possible to simply push a local image to a backup server: –

lxc launch my-new-image myremote:susanet-backup

In this case, a container called ‘susanet-backup’ is created on the remote server I aliased as ‘myremote’, using my local image ‘my-new-image’ (which gets copied across in the process).

Networking to the outside world

A container can be given an interface on the bridge using something like the following: –

lxc config device add susanet-wp eth0 nic name=eth0 nictype=bridged parent=lxdbr0

We can use DNAT to forward host ports (e.g. on an external IP address) to the bridged interface using something like the following: –

iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT \
--to-destination 10.102.22.71:80

iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 443 -j DNAT \
--to-destination 10.102.22.71:443

A possibly simpler and more convenient way to connect the container to an external IP address is to use the ‘proxy’ device. This connects an ip:port address on the host to an ip:port in the container, so: –

kevin@vps1:~$ lxc config device add susanet-wp http proxy \
listen=tcp:0.0.0.0:80 connect=tcp:127.0.0.1:80
kevin@vps1:~$ lxc config device add susanet-wp https proxy \
listen=tcp:0.0.0.0:443 connect=tcp:127.0.0.1:443

These connect ports 80 and 443 on all host interfaces to the corresponding ports on the container’s localhost interface.

Filtering Commercially Driven Content from the Web

The Internet seems awash with ‘click-bait’ and sponsored content – articles created primarily to generate money, sometimes plagiarised, misleading, exaggerated, or provocative just to get views. The good stuff – articles often written simply because it’s good to share knowledge and ideas – is getting harder to find.

My proposal is to create a search engine that, rather than systematically crawl the web, starts with a seed corpus of high quality links, and fans out from there, stopping when the quality drops. The result will hopefully be a searchable index of pages that were created to impart information rather than to earn cash from eyeballs.

As a proxy for quality, I’ll use the number of ‘hits’ emitted by uBlock Origin (uBO) as a page loads. That is, if one page results in 50 blocked requests, then I’d suspect this content is heavily driven by commercial interests, and therefore has a lower likelihood of being original or worthwhile content. If it has 5 blocked requests, there’s a higher likelihood that it’s original and interesting.

There are many things wrong with such a simple assumption, but I think it’s a promising starting point because, of the first 300k links extracted from a known high quality source (Hacker News, as described below),  some 85% of the pages linked to resulted in 9 or fewer uBO hits, while 66% of the pages resulted in 3 hits or fewer. For 31% of these links, uBO didn’t block any requests at all.

As an experiment to determine how feasible this is, I am extracting links from a source with generally high quality content, initially Hacker News (HN) stories and comments, and for each link I record the URL along with its HN score. The result is a list of URLs that I then score via uBO, storing the extracted plain text, along with the number of uBO hits, in a MongoDB collection for subsequent indexing.

The main goals are a) to be easily reproducible by others, b) to be resource efficient for cheap VPS deployment, and c) to be scalable as new sources are included. So, I want a low barrier for anyone who wants to join the indexing effort, and the means to grow if more people join.

With a seed of 5M URLs, and a further two levels of, say, 5 links per URL, the index would cover 125M pages, which would be around twice the level of search engines circa 1998. I reckon that could be achieved with 20 low-end VPS instances over a period of 4 months.

Getting the Seed URLs

I use JavaScript on Node to extract story URLs, and any hrefs embedded in comments, from the Hacker News API that’s available on Firebase. The story titles and points, and perhaps even the karma of comment authors, might also be used as metadata to score and index the URLs. From a sample of 50K of the most recent HN items, I extracted 15K URLs, but further testing suggests around 25K URLs per 100K HN items.

If this ratio holds, then I’d expect around 5M URLs to process, given that we’re fast heading towards 20M HN items.

Analysing Content of the URLs

I use Puppeteer to access each URL in the list and record, for each, the number of requests blocked by uBlock Origin. This is a relatively slow process, since the browser has to fetch the page and process its associated resources in order to allow uBlock Origin to identify requests that should be blocked.

Multiple pages can be created in parallel (effectively multiple tabs in Puppeteer), and the system seems capable of handling 20 or more pages at once in the Puppeteer instance. However, this is entirely dependent on the demands of the pages being loaded; with the 4GB RAM and 2 CPUs on the VPS I’m currently running, the load average hovers between 7 and 10, with close to 100% CPU utilisation, for a current average of 1.7 pages per second.

Once the page is loaded (the ‘load’ event has triggered, followed by a predefined delay), I take the original DOM content and process it through a text extractor (jusText, though I am also considering Dragnet). I had intended to process the results through RAKE or Maui to identify keywords and phrases, mainly to cut down on the space required and the words to index, but decided against this and store the extracted text in full – highlighting of search results needs it.

The metadata I keep include the HN score associated with the item (comment or story), the length of the extracted text, and of course the number of requests that uBlock rejected. Karma and user id might also be useful but, particularly with the latter, it feels slightly intrusive to appropriate personal data in this way. The jury’s still out on that.

If I can average one URL processed per second, then the corpus would be processed in around 52 days. Currently, processing is running at 1.7 URLs per second, so if that rate holds, then the seed can be analysed in 30 days. The average size of data held for each URL seems to be just over 6KB (e.g. the URL itself, meta-data, and the extracted body text), so the total source for indexing should require around 30GB of storage, which should fit in the 40GB disk on my current VPS.

The data is, for now at least, stored in a MongoDB server, which forms the source data for the searchable index. Though MongoDB has some great built-in text-search facilities, it breaks due to lack of memory on a low-end VPS (around 1GB free on a 2GB instance), even with a relatively small number of documents. Maybe it could be coerced into working, but I decided going straight to Lucene would be the best way forward.

Build a Search Engine

The search engine itself has been built on Lucene, with a very simple HTML front-end built using Java Servlets, since a JVM would already be running Lucene. It accepts a query string and, optionally, thresholds for uBlock hits and the HN scores of the stories or comments that the page’s link came from, and returns the top 100 hits.

Even with mostly default behaviour, Lucene is generating indexes at roughly 30% of the input size, at 15 minutes per 0.5M pages over the network, and is already providing fast searches with queries that support boolean operators (remember + and – to include and exclude words?) and highlighting of results. A truly awesome piece of software, and I feel I’ve barely scratched the surface.

Some search results for ‘Facebook’.

The index currently has around 750,000 pages, but even so it’s interesting to search on. I get a sense of ‘discovery’ from it, something I feel is missing from the big search engines which, while great at answering very specific questions or finding stuff to buy, seem worse at finding the little gems that made the Internet interesting. However, the URLs were harvested from Hacker News, so I really shouldn’t be too surprised that I find the results interesting.

One of my next tasks is figuring out how best to search a distributed index, or else to merge multiple indexes together. Maybe ElasticSearch or Solr will be a tool for this.

Do you want to help?

When I have a repeatable process, then I’ll push the code so far to GitHub along with instructions on how to build the environment. For this project, I’m running Debian 9.3 on a 2GB and a 4GB 2-CPU VPS server with combined 60GB storage. If this sounds interesting, and you want to contribute (and want to release your contribution under a free-software license), then please get in touch via the comments or directly at kevin@susa.net.

Docker WordPress in a subdirectory

Moving a standard WordPress installation to a different host is a minor pain – I only do this occasionally, so every time I need to consider the configuration of the original environment and how this translates to the new server. Nothing too challenging, but tedious and prone to error.

So I figured Docker containers are the way to go and, sure enough, Docker Hub has more than enough images for my needs. The only issue is that I don’t dedicate my server to WordPress – it’s in a ./wordpress subdirectory of the web root. Docker’s official WordPress image keeps reinstating the WordPress files if they’re not found in the web root.

TL;DR – create the directory wordpress in the container’s web root and add -w="/var/www/html/wordpress" to the docker run (or create) command. This sets the current working directory for docker-entrypoint.sh to work in, and it will install the wp-* and .htaccess files there.

The rest of this post documents my setup; more than anything it’s just a future reference for myself. I’ll start with the articles I used as references when setting this up.

The first is How to Install WordPress with Docker on Ubuntu, which is a clearly written tutorial that goes further and uses Nginx as a reverse proxy to the container (something I’ve chosen not to do for now). The second is Install fail2ban with Docker, which describes what’s required to get fail2ban configured to read a container’s logs. Although I don’t document anything further on this, it’s really useful to inhibit brute-forcing of the server.

Let’s Encrypt Verification via DNS

For me, the least invasive way to verify domain ownership for SSL/TLS was via DNS TXT records. This avoids the need to integrate with web servers and bind to/forward from a public port; however, it does mean that I have to edit my DNS zone to add the required TXT records as instructed by certbot.

certbot certonly --manual --preferred-challenges=dns\
  -d example.com\
  -d www.example.com

At the end of this process, the certificates are placed in /etc/letsencrypt/live. Apache’s SSL virtual host can be configured with the following directives, after the WordPress container is created as detailed below.

SSLCertificateFile /etc/letsencrypt/live/example.com/cert.pem
SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem

The /etc/letsencrypt directory can be brought with the container to a new host and the certificates will be made visible to the WordPress container using a bind-mount, as detailed below. If you only want to transfer the above files in /etc/letsencrypt/live, then remember that they’re links into ../archive/, so they must be dereferenced if an archive tool is used (e.g. option --dereference in the  tar command).

Note that renewing certificates requires re-running the above certbot certonly ... command; if the list of domains is the same, certbot assumes that a manual renewal is intended. The certbot renew command, as far as I understand it, is used only for automated (e.g. cron) renewals, and requires some hooked-in code to respond to the challenges that the server generates (e.g. to install the TXT records that it asks for). Renewed certificates are not automatically picked up by Apache, so a docker container restart wp-apache2 is required.

Creating the containers

A container is created from the Docker Hub wordpress image, and also from the mariadb image. These run with bind-mounts to expose host directories inside the containers. The directory structure on the host has /opt/wordpress/html for the WordPress container, /opt/wordpress/database for the MariaDB container.

mkdir -p /opt/wordpress/database
mkdir -p /opt/wordpress/html/wordpress

I also mount /etc/letsencrypt/live as a read-only directory for Apache SSL/TLS. This directory must exist on the host, with the files susa.net/cert.pem and susa.net/privkey.pem (Apache needs these for default-ssl.conf).

The following commands assume that these directories have been created, and that valid certificates have been created (see above for notes if this hasn’t been done yet). First, we create the MariaDB container for our WordPress data.

docker run -e MYSQL_ROOT_PASSWORD=<mysqlrootpw>\
 -e MYSQL_USER=wpuser -e MYSQL_PASSWORD=<wpuserpw>\
 -e MYSQL_DATABASE=wordpress_db\
 -v /opt/wordpress/database:/var/lib/mysql\
 --name wp-mariadb -d mariadb

Next we create the WordPress container. This links to the wp-mariadb container we have just created, exposing that container to WordPress under the hostname mysql. It also exposes the wp-mariadb container’s environment variables to our wp-apache2 container, so we unnecessarily divulge, for example, MYSQL_ROOT_PASSWORD to a public Internet server. This is not ideal (wp-apache2 has no need to know MYSQL_ROOT_PASSWORD), and is probably why --link is being deprecated in favour of Docker networks.

docker run -w="/var/www/html/wordpress"\
 -e WORDPRESS_DB_USER=wpuser\
 -e WORDPRESS_DB_PASSWORD=<wpuserpw>\
 -e WORDPRESS_DB_NAME=wordpress_db\
 -p 80:80 -p 443:443\
 -v /opt/wordpress/html:/var/www/html\
 -v /etc/letsencrypt/live:/etc/letsencrypt/live:ro\
 --link wp-mariadb:mysql --name wp-apache2 -d wordpress

It should now be possible to access Apache on port 80 and port 443, and WordPress should be on the path /wordpress/.

Note the -d flag that detaches the process and returns control back to the calling shell. This is essential if the containers are to run in the background, and can be omitted to keep the process in the foreground, useful when you want logs to be reported to stdout.

Some commands used while setting up the wp-apache2 container.

Most of the commands below should really be brought into a Dockerfile, but it’s convenient for my use case to simply build a baseline image that can ultimately be committed.

docker exec -it wp-apache2 bash  # This takes us to bash in the container
a2enmod ssl   # These commands are run on bash in the container
a2ensite default-ssl
apt-get install vim
vi /etc/apache2/sites-enabled/default-ssl.conf

For wp-mariadb, I often find it useful when testing to get WordPress to respond to a different siteurl, so I created a bash script update_wp_siteurl.sh in the root directory.

#!/bin/bash

if [ "${1}" == "" ]; then
  echo "Usage ${0} URL (such as http://www.susa.net/wordpress)"
  exit 0
fi
mysql -u${MYSQL_USER} -p${MYSQL_PASSWORD} ${MYSQL_DATABASE} <<EOSQL
  update wp_options set option_value = '${1}'
   where option_name in ('home', 'siteurl');
  select option_name, substr(option_value, 1, 60) as option_value
   from wp_options
   where option_name in ('home', 'siteurl');
EOSQL

Managing the images and migration

The following commands commit the containers as images in the local repository and save those images as gzipped tar files. I use the tag to denote the host on which the container’s image was created; in my case the host is kakapo.

docker commit wp-apache2 wordpress:kakapo
docker commit wp-mariadb mariadb:kakapo
docker image save wordpress:kakapo |gzip > wordpress-kakapo_image.tgz
docker image save mariadb:kakapo |gzip > mariadb-kakapo_image.tgz

The last two lines can be condensed into a single command for convenience.

docker image save\
  wordpress:kakapo\
  mariadb:kakapo | gzip > wordpress-mariadb-kakapo_images.tgz

The saved files can be copied to a remote host and loaded into the new host’s Docker repository.

kevin@kakapo:~$ scp wordpress-mariadb-kakapo_images.tgz newhost.example.com:

On the new host, load the images with

kevin@newhost:~$ zcat wordpress-mariadb-kakapo_images.tgz | docker image load

The newly loaded images can be used to create containers on the new host as described above, only using the images wordpress:kakapo and mariadb:kakapo instead of pulling the official images.

The bind-mounts

Remember that, before creating the containers on a new host, the /opt/wordpress/ and /etc/letsencrypt/ directories have to be transferred and accessible to docker in the location specified by the -v (or --mount) parameter. Depending on the environment, something like rsync or tar then scp should suffice.

On the new host, make sure that ./wp-content/* is writeable by the user and/or group www-data. I usually run Debian, so the host UID/GID for www-data is the same as in /etc/passwd in the WordPress container. Therefore it’s enough to simply chown -R www-data:www-data wp-content.

Generally, when using bind mounts, the permissions have to be considered from the container’s point of view. The directory can be seen both from the host, and from within the container environment. The UID/GID of the files will be interpreted according to the environment that’s reading the filesystem. If a file is writeable only by UID 1000 on the host (whoever that may be), then only processes running as UID 1000 in a container will be able to write the file.

Docker Volumes should really be used instead. They can be managed directly using Docker, so as long as I can also get access to the files from the host, then volumes would be a better way to go. Something to do in future.

I Closed my LinkedIn Account

Admittedly, it had been unused for quite a long time but, regardless, my LinkedIn profile had a few historical recommendations from people I actually knew and respected, so I hesitated before closing it.

The main reason I had for closing my LinkedIn account is to protest in some small way against the lawsuit that LinkedIn are pursuing against hiQ for scraping (automatically fetching and processing) public profiles of members.

I don’t know or care anything much about hiQ or their scraping antics, but LinkedIn pushing to criminalise accessing of public profiles, via a web server bound to a public TCP port, on a publicly visible computer is a dangerous step in the wrong direction.

The argument LinkedIn are trying to make is that they wrote to hiQ saying “You are not authorised to access our web site”, and now claim that subsequent access to their site constitutes criminal ‘hacking’ (e.g. breaking into a computer system to obtain private data).

That’s nonsense, concocted to suit their commercial objectives. The fact that they effectively scraped my address book when I signed up did not go unnoticed. They very likely scrape the sites that their millions of members link to in their profiles, posts, and messages. It’s hard to believe that their company grew without analysis of data legally harvested from public web sites.

The consequences of LinkedIn getting their way would be damaging to the Internet (we’d never have had search engines, with such a restricted Internet). There are plenty of technical measures they can take to address their concerns, without trying to foist laws on us all to address their particular commercial concerns.

Increasingly, ‘social’ media companies are becoming, in my opinion, blatantly anti-social. So much so that here we have two companies who have built their business model around surreptitiously tracking and analysing personal data (that in all likelihood was not given for those purposes) having a very public spat that could impact on the original intentions of the web, and on the principles that got us here in the first place.

Atech Postal – notes on the Fast Server

Atech’s Postal is an SMTP server and web management interface that’s geared towards transactional and bulk mailing (e.g. for application to user communication, and for marketing respectively). It’s quite well documented, but more importantly it’s open source (MIT license), and also seems well written – elegant, self-documenting code that’s easy to follow, useful comments, well structured. A bit of a joy really.

The Fast Server is a web server process, separate from the management interface server, that handles requests to click- and open-tracking links. However, the documentation on the Fast Server process, which is used for logging email Open and Click events, seems to be at least partially out of date, so I thought I’d dig into the code to understand and document the bits that I was unsure of.

According to the docs, the Fast Server is meant to run on a separate IP address. It binds to two ports, one for HTTP, and the other for HTTPS requests. I want to figure out what implications there might be of binding this server to the same IP address as the nginx server, but on different ports.

In this post, I’ll refer to the main Postal management interface as web_server, and the click-tracking server as fast_server. These names correspond to the names ‘web’ and ‘fast’ in the process list returned by the ‘postal status’ command.

I’ll refer to the IP addresses as primary, for the main IP address used for web_server and the SMTP server, and secondary, for the IP address required by fast_server according to the documentation. I’ll use 192.168.123.1 and 192.168.123.2 as placeholders in the text for primary and secondary IP addresses respectively.

First, the 2 IP configuration

Before that, though: since Linode have kindly allocated an extra IP address to the node for me, I’ll cover what’s required for a 2 IP configuration.

The nginx config

It might be worth clarifying here that nginx binds to the primary IP simply to proxy requests to the web_server process bound to localhost:5000. On the other hand, fast_server by default binds directly to the secondary IP, ports 80 and 443.

Therefore, we need to stop nginx binding to all interfaces, as it’s configured to do by default. The change required is to the ‘listen’ directives for the IPv4 addresses in the nginx configuration. So we need something like the following, substituting 192.168.123.1 with your primary IP: –

listen 192.168.123.1:80;

and, for the SSL server block : –

listen 192.168.123.1:443 ssl;

When nginx is restarted, it’s no longer bound to the second IP address, making it available for use with the click tracking ‘fast_server’ process.

The DNS Zone Configuration

The DNS configuration should include (among others) the following entries for the IP addresses (assuming this is in the example.com zone): –

postal          A     192.168.123.1
track.postal    A     192.168.123.2
click           CNAME track.postal.example.com.

So postal.example.com will resolve to the primary IP, track.postal.example.com resolves to the secondary IP, and click.example.com resolves to the canonical track.postal.example.com (which subsequently resolves to the secondary IP).

When tracking opening of emails and the links contained within, it is click.example.com that’s embedded in the HTML of the email to provide the tracking image and rewritten links.

The rest

When the fast_server is enabled in ‘./config/postal.yml’, Postal needs to be stopped and then started (restart alone doesn’t seem to bring the fast_server process online).

From here, it’s simply a matter of adding a new ‘Domain -> Tracking Domain’ from the web interface.

Possible single-IP configuration

The config in ./spec/config/postal.yml defines fast_server’s ports as 5010 (HTTP) and 5011 (HTTPS). However they are defined as 80 and 443 in the file ./config/postal.defaults.yml, which is the base configuration file that is overlaid with ./config/postal.yml at runtime.

This suggests that the Fast Server can at least run correctly on non-standard ports. My guess is that links to non-standard ports are more likely to be blocked by (e.g. corporate) firewalls, or perhaps emails containing such links are seen as more spammy by some mail hosts. Whatever the reasons, the developers clearly felt that it was best to use a separate IP address for fast_server.

There is an example configuration on the GitHub project, issue 321, which binds fast_server to the standard ports 80/443, and web_server to port 8080. It reportedly runs fine like this; however, one thing that I think needs to be confirmed is whether or not CertBot (or whatever ACME client is being used) can renew the certificates when nginx is running on a non-standard port.

 

The Rack Interface is itself invoked by the Client class, which is instantiated and invoked in fast_server/Server.rb when a connection is made to fast_server.

Binding to SMTP and HTTP (privileged) ports

The ruby executable absolutely must have permission to bind to privileged ports. This is required for the fast_server and the SMTP server processes. Since this permission is an extended attribute of the file, upgrading the ruby package may mean the permission is lost and needs to be reinstated.

The command from Postal’s installation script is: –

sudo setcap 'cap_net_bind_service=+ep' /usr/bin/ruby2.3

What makes this an issue is that there’s nothing in the logs to show why the server processes have failed to start. The file log/rails.log only shows ‘Raven 2.1.0 configured not to capture errors’. Perhaps the configuration file ‘app/config/environments/production.rb’ can be edited to provide more useful information?

Random Notes

./lib/postal/message_db/message.rb -> create_load(request) is invoked by the Rack Interface class when an image URL is requested. This is Open Tracking, and ultimately results in the ‘load’ details being applied to the database.

Braun ThermoScan Fix – Low Battery Warning Switch Off

We have a Braun Thermoscan infra-red (IR) thermometer that has been working perfectly for about five years. It started complaining about low batteries and shutting off, despite my replacing them with new batteries that I checked had plenty of charge.

When I opened it, I discovered that the batteries connect to the circuit board via simple metal clip contacts, and that the contacts had some corrosion on them, which was preventing power from getting to the board – hence the low-battery complaints.

So a very simple fix is to just clean the corrosion from the battery terminals inside the thermometer. You’ll need a Torx T9 screwdriver (Maplin, eBay, Amazon, maybe pound shops).
Continue reading

Arduino Yun Reading WH1080 using AUREL RX-4MM5

Here’s the sketch. It just reads and dumps to the console; the bridge can be used to send the data to the GNU/Linux side of the Yun.

See the other post on doing this with a Raspberry Pi for some code to turn the data into something useful.

I’m using the MCU of the Yun to do the RF work with the AUREL RX-4MM5 (a proper OOK receiver); it seems a lot more dependable than the Raspberry Pi + RFM01 (or RFM12B). Continue reading

Raspberry Pi Power Controller

This article is a work in progress to create a power-controller for the Raspberry Pi based on a PIC microcontroller and MOSFET. The PIC implements an I2C slave to allow power control, and also to approximate the registers of a PCF8563 Real Time Clock (RTC) chip, to allow timed wake-up of the Pi.

  • Power the Raspberry Pi off and on with a push-button.
  • Fully shut down the Raspberry Pi on ‘shutdown -h’.
  • Wake up at a specified time (one-off or periodic).
  • Monitor the supply voltage.
  • Log glitches in the power-supply (e.g. caused by USB device activity).
  • Maintain the time from a CR2032 button cell.

During power-down, the circuit currently draws around 5μA, which is useful where a battery is being used to power the Pi (remote solar-power applications, or in-car systems, for example).

The Pi is able to instruct the PIC to power it down using a short I2C command sequence. Wake up events include a push-button, or other voltage-sense on an input pin. Continue reading

PIC/MOSFET PWM Model Train Controller

Having been unable to resist buying some old Hornby OO Gauge bits from the second hand cabinet in a model shop, justification came from the educational value it would offer my son if I could make a speed controller, perhaps adding a sensor or two – the essence of industrial control and feedback mechanisms. Being three and a half, he just wanted to make the train fly off the track, but at least he enjoyed it.

This is a project to create a model train speed controller using the Pulse Width Modulation (PWM) output of a PIC16F690 microcontroller, to drive a MOSFET that ultimately controls the voltage on the tracks. The train will automatically switch into reverse when the control is turned anti-clockwise through the zero point. Continue reading

Raspberry Pi reading WH1081 weather sensors using an RFM01 and RFM12b

This article describes using an RFM01 or RFM12b FSK RF transceiver with a Raspberry Pi to receive sensor data from a Fine Offset WH1080 or WH1081 (specifically a Maplin N96GY) weather station’s RF transmitter.

I originally used the RFM12b, simply because I had one to hand, but later found that the RFM01 appears to work far better – the noise immunity and the range of the RFM01 in OOK mode are noticeably better. They’re pin compatible, but the SPI registers differ between the modules, in terms of both register-address and function.

This project is changing to be microcontroller based, and using an AM receiver module (Aurel RX-4MM5) – a much more effective approach – arduino-yun-reading-wh1080-using-aurel-rx-4mm5. Currently testing on Arduino Yun, but will probably move to a more platform agnostic design to support Dragino and Carambola etc.

Continue reading

Raspberry Pi GPFSEL, GPIO, and PADS Status Viewer

The gpfsel_list (I maybe should have called it lsgpio) utility displays a list of the currently configured function selections across all available GPIO pins and, for pins configured as GPIO, the current state of the pins. For pins configured with ALTn functions, the selected function is listed according to the datasheet information.

It also shows the state of the PADS registers to display the configured drive current, hysteresis, and slew setting for the three groups of pins (GPIO 0-27, 28-45, and 46-53).

It’s been written to produce output that’s easy to grep and cut, and performs only read operations on the registers – it can’t be used to modify settings, though I suppose this could change in future.

Continue reading

Raspberry Pi – Driving a Relay using GPIO

There’s something exciting about crossing the boundary between the abstract world of software and the physical ‘real world’, and a relay driven from a GPIO pin seemed like a good example of this. Although a simple project, I still learned some new things about the Raspberry Pi while doing it.

There are only four components required, and the cost for these is around 70p, so it would be a good candidate for a classroom exercise. Even a cheap relay like the Omron G5LA-1 5DC can switch loads of 10A at 240V. Continue reading

Raspberry Pi PCF8563 Real Time Clock (RTC)

Having recently received my Raspberry Pi, one of the first things I wanted to do was hook up a real-time clock chip I had lying around (a NXP PCF8563) and learn how to drive I2C from the BCM2835 hardware registers. Turns out it’s quite easy to do, and I think makes a useful project to learn with.

So, here are some notes I made getting it to work, initially with Chris Boot’s forked kernel that incorporates some I2C handling code created by Frank Buss into the kernel’s I2C bus driver framework.

After getting it to work with the kernel drivers, I created some C code to drive the RTC chip directly using the BCM2835 I2C registers, using mmap() to expose Peripheral IO space in the user’s (virtual) memory map, the technique I learned from Gert’s Gertboard demo software, though my code’s simpler (hopefully without limiting functionality!).

Note: Revision 2 boards require the code to access BSC1 (I2C1) rather than BSC0 (I2C0), so changes to the peripheral base address may be required or, in the case of the Linux I2C driver, a reference to i2c-1 rather than i2c-0. It should be simple enough, but I don’t want to write about things I haven’t done or tested, so a bit of extra work by the reader may be required.

Continue reading

Raspberry Pi – Hardware Is The New Software

There’s been a lot written about the Raspberry Pi, a small single-board computer with I/O pins on the circuit board, and a small price tag (£25 or so). At the time of writing, it’s not yet available to buy, but there’s been a lot of interest in the pre-production versions and the promise of an imminent launch. Continue reading

Oil Watchman Power Tube Batteries

Here’s a little tip for anyone with an Oil Watchman tank gauge. If your batteries run out, you don’t need to spend £30 or so replacing it. You can open the tube and replace the four AAA cells that are inside, it’s a simple job – five minutes if you’re well organised, but allow half an hour if you prefer to take your time. Continue reading

SmartAlpha 433 MHz (and 866 MHz) RF Transceiver Module Notes

This entry is partly just a space for me to make notes on the SmartAlpha module from RF Solutions. If you want to contact me on any of the information contained here, please email kevin at susa dot net.

Please note that since writing this, the datasheet has changed on these devices and, at first glance, some of the anomalies I’ve seen look to have been addressed. Please also note that my observations were initially done by just running code and watching display-text and LEDs, and though I’ve since used an analyser to verify my understanding, I’ve not been particularly methodical in my approach. Continue reading