Backdate original posts - website

commit 9070fddd10448ed35fadde4078469b2522a46319
parent d89495bab1c1645e9ddb43a8e2cc1e8143b7fb07
Author: Mikołaj Lenczewski <mblenczewski@gmail.com>
Date:   Mon, 20 Jan 2025 15:30:38 +0000

Backdate original posts

Diffstat:
A date.sh  | 8 ++++++++
M src/blog/cdoku.html  | 45 ++++++++++++++++-----------------------------
M src/blog/clfs.html  | 8 +++-----
M src/blog/pyecs.html  | 84 +++++++++++++++++++++++++++++++++++--------------------------------------------

4 files changed, 64 insertions(+), 81 deletions(-)
diff --git a/date.sh b/date.sh
@@ -0,0 +1,8 @@
+#!/bin/sh
+
+set -ex
+
+touch -d 20200910 src/blog/test.html 
+touch -d 20210514 src/blog/cdoku.html 
+touch -d 20201112 src/blog/pyecs.html 
+touch -d 20201010 src/blog/clfs.html
diff --git a/src/blog/cdoku.html b/src/blog/cdoku.html
@@ -9,8 +9,7 @@
   which couldn&#39;t be completed after 3 hours of runtime:
 </p>
 
-<pre>
-<code class="language-text">
+<pre><code class="language-text">
  . . . | . . . | . . .
  . . . | . . 3 | . 8 5
  . . 1 | . 2 . | . . .
@@ -22,8 +21,7 @@
  5 . . | . . . | . 7 3
  . . 2 | . 1 . | . . .
  . . . | . 4 . | . . 9
-</code>
-</pre>
+</code></pre>
 
 <p>
   This case is pathological, because the initial row consists of the numbers 
@@ -43,8 +41,7 @@
   in X:
 </p>
 
-<pre>
-<code class="language-text">
+<pre><code class="language-text">
 For example:
 X = { 0, 1, 2, 3, 4, 5, 6 }
 
@@ -53,8 +50,7 @@ A = { 0, 1, 3 }
 B = { 2, 4, 5 }
 C = { 2, 6 }
 D = { 6 }
-</code>
-</pre>
+</code></pre>
 
 <p>
   In an exact cover problem, you are trying to find a subset of S such that each 
@@ -93,8 +89,7 @@ and our possibility set S will be the union of all numer-position pairs.
   Algorithm X works as follows:
 </p>
 
-<pre>
-<code class="language-text">
+<pre><code class="language-text">
 Let M be our possibility-constraint matrix
 Let M(i,j) denote the value in column i, row j
 
@@ -107,8 +102,7 @@ Let M(i,j) denote the value in column i, row j
          Remove row i from the matrix
       Remove column j from the matrix
 6. Recurse on the new, reduced matrix M
-</code>
-</pre>
+</code></pre>
 
 <p>
   Knuth recommends picking the column with the fewest 1s in step 2. Although any 
@@ -117,8 +111,7 @@ Let M(i,j) denote the value in column i, row j
   our previous example problem, heres how Algorithm X would run through the problem:
 </p>
 
-<pre>
-<code class="language-text">
+<pre><code class="language-text">
 X = { 0, 1, 2, 3, 4, 5, 6 }
 
 S = { A, B, C, D }
@@ -183,8 +176,7 @@ M:
 19. M has no columns, return true
 
 Therefore, S = { A, B, D}
-</code>
-</pre>
+</code></pre>
 
 <p>
   As we can see, the algorithm is quite efficient. We didn&#39;t have to make many 
@@ -206,8 +198,7 @@ Therefore, S = { A, B, D}
   An example of what the linked-list based matrix would look like for our example:
 </p>
 
-<pre>
-<code class="language-text">
+<pre><code class="language-text">
 &quot;Dense&quot; Matrix:
     0 1 2 3 4 5 6 
 A [ 1 1 0 1 0 0 0 ]
@@ -222,8 +213,7 @@ A | A0 &lt;&gt; A1 &lt;------&gt; A3
 B |             B2 &lt;------&gt; B4 &lt;&gt; B5
 C |             C2 &lt;------------------&gt; C6
 D |                                     D6
-</code>
-</pre>
+</code></pre>
 
 <p>
   All nodes in a column are doubly-linked to the elements above them and below 
@@ -250,15 +240,13 @@ D |                                     D6
   Splicing an element out of a linked list:
 </p>
 
-<pre>
-<code class="language-text">
+<pre><code class="language-text">
 A &lt;---&gt; B &lt;---&gt; C
 
 After splicing out B:
 A &lt;---&gt; C
  \_ B _/
-</code>
-</pre>
+</code></pre>
 
 <p>
   Here, B has retained its previous pointer to A and its next pointer to C. It can 
@@ -267,18 +255,17 @@ A &lt;---&gt; C
   itself (B):
 </p>
 
-<pre>
-<code class="language-text">
+<pre><code class="language-text">
 A &lt;---&gt; C
  \_ B _/
 
 After splicing in B:
 A &lt;---&gt; B &lt;---&gt; C
-</code>
-</pre>
+</code></pre>
 
 <h3>Implementation</h3>
 <p>
   Although this post has mostly been on Algorithm X, and hasn&#39;t shown any code, 
-  my implementation (in C) can be found <a href="https://github.com/mblenczewski/cdoku">here</a>.
+  my implementation (in C) can be found
+  <span><a href="https://git.lenczewski.org/cdoku/log.html">here</a></span>.
 </p>
diff --git a/src/blog/clfs.html b/src/blog/clfs.html
@@ -18,16 +18,14 @@
   correct sequence of operations:
 </p>
 
-<pre>
-  <code class="language-text">
+<pre><code class="language-text">
 1)   Build barebones binutils for your chosen arch [no libc]
 2)   Build barebones gcc for your chosen arch      [no libc]
 3)   Compile your libc for your chosen arch
 3.1) Compile your libc++ for your chosen arch
 4)   Recompile binutils for your chosen arch
 5)   Recompile gcc for your chosen arch
-  </code>
-</pre>
+</code></pre>
 
 <aside>Step 3.1 is optional if you don't want to support C++ in your cross-gcc</aside>
 
@@ -59,5 +57,5 @@
   a wall. After hopping onto irc and asking for help, these issues were
   eventually resolved and I managed to continue with the project. The source
   code for the build script can be found
-  <span><a href="https://github.com/mblenczewski/clfs-bootstrap">here</a></span>.
+  <span><a href="https://git.lenczewski.org/clfs/log.html">here</a></span>.
 </p>
diff --git a/src/blog/pyecs.html b/src/blog/pyecs.html
@@ -62,18 +62,16 @@
   "purity" and simplicity reasons.
 </p>
 
-<pre>
-<code class="language-c">
+<pre><code class="language-c">
 typedef unsigned long entity_t; // as simple as a uint64
 
 typedef struct Entity {
-	unsigned long id;
-	unsigned long flags;
-	unsigned long tags;
-	// ...
+  unsigned long id;
+  unsigned long flags;
+  unsigned long tags;
+  // ...
 } Entity_t; // or something more complex
-</code>
-</pre>
+</code></pre>
 
 <h3>Components</h3>
 <p>
@@ -101,8 +99,7 @@ typedef struct Entity {
   from an intel pdf are as follows:
 </p>
 
-<pre>
-<code class="language-text">
+<pre><code class="language-text">
 Data Source                                             Latency
 L3 Cache hit, line unshared between cores               ~40 cycles
 L3 Cache hit, line shared readonly between cores        ~65 cycles
@@ -110,8 +107,7 @@ L3 Cache hit, line shared readwrite between cores       ~75 cycles
 L3 Cache hit, from other core                           ~100-300 cycles
 Local DRAM                                              ~60ns
 Remote DRAM                                             ~100ns
-</code>
-</pre>
+</code></pre>
 
 <p>
   For a CPU clocked at 2GHz (slower than modern CPUs), each cycle takes half a 
@@ -128,55 +124,53 @@ Remote DRAM                                             ~100ns
   Getting back to why components are a thing, with an example:
 </p>
 
-<pre>
-<code class="language-c">
+<pre><code class="language-c">
 // clean separation into multiple discrete components
 typedef struct Transform {
-	double px, py, pz;
+  double px, py, pz;
 } Transform_t;
 
 typedef struct Velocity {
-	double vx, vy, vz;
+  double vx, vy, vz;
 } Velocity_t;
 
 typedef struct Mesh {
-	void *mesh_data;
+  void *mesh_data;
 } Mesh_t;
 
 // chucking it all into one mixer and pressing "blend"
 typedef struct State {
-	Transform transform;
-	Velocity velocity;
-	Mesh mesh;
+  Transform transform;
+  Velocity velocity;
+  Mesh mesh;
 } State_t;
 
 // ...
 
 void main(void) {
-	// this approach will have worse cacheline efficiency
-	State game_objects[] = { ... };
+  // this approach will have worse cacheline efficiency
+  State game_objects[] = { ... };
 	
-	for (int i = 0; i < ARRLEN(game_objects); i++) {
-		calculate_physics(game_objects[i]);
-		render_object(game_objects[i]);
-	}
+  for (int i = 0; i < ARRLEN(game_objects); i++) {
+    calculate_physics(game_objects[i]);
+    render_object(game_objects[i]);
+  }
 
-	// this approach will have better cacheline efficiency
-	Entity entities[] = { ... };
-	Transform transforms[] = { ... };
-	Velocity velocities[] = { ... };
-	Mesh meshes[] = { ... };
+  // this approach will have better cacheline efficiency
+  Entity entities[] = { ... };
+  Transform transforms[] = { ... };
+  Velocity velocities[] = { ... };
+  Mesh meshes[] = { ... };
 
-	for (int i = 0; i < ARRLEN(entities); i++) {
-		process_physics(transforms[i], velocities[i]);
-	}
+  for (int i = 0; i < ARRLEN(entities); i++) {
+    process_physics(transforms[i], velocities[i]);
+  }
 
-	for (int i = 0; i < ARRLEN(entities); i++) {
-		render(transforms[i], meshes[i]);
-	}
+  for (int i = 0; i < ARRLEN(entities); i++) {
+    render(transforms[i], meshes[i]);
+  }
 }
-</code>
-</pre>
+</code></pre>
 
 <p>
   Why is the second approach better? Don't we iterate twice compared to once? The 
@@ -189,8 +183,7 @@ void main(void) {
   for Transform, Velocity, and Mesh. The dashes show a simplified cacheline.
 </p>
 
-<pre>
-<code class="language-text">
+<pre><code class="language-text">
 Mixed State:
 | x0 y0 z0 x1 | y1 z1 x2 y2 | z2 x3 y3 z3 | x4 y4 z4    |
 | ----------- | ----------- | ----------- | ----------- |
@@ -201,8 +194,7 @@ Components:
 
 | x0 z0 x1 z1 | x2 z2 x3 z3 | x4 z4       |
 | ----------- | ----------- | ----------- |
-</code>
-</pre>
+</code></pre>
 
 <p>
   Not only does processing the mixed state do worse on the instruction cache, it 
@@ -253,16 +245,14 @@ Components:
   as discussed earlier.
 </p>
 
-<pre>
-<code class="language-python">
+<pre><code class="language-python">
 class RenderSystem:
   def actions():
     return {
       self.process_active: (Transform2D, ScreenElement),
       self.process_stale: (StaleTag,),
     }
-</code>
-</pre>
+</code></pre>
 
 <p>
   This is the approach I took, where each System has an <code>actions()</code>

	website website.git
	git clone git://git.lenczewski.org/website.git
	Log \| Files \| Refs

A	date.sh	\|	8	++++++++
M	src/blog/cdoku.html	\|	45	++++++++++++++++-----------------------------
M	src/blog/clfs.html	\|	8	+++-----
M	src/blog/pyecs.html	\|	84	+++++++++++++++++++++++++++++++++++--------------------------------------------