collin park: computers

Saturday, July 27, 2024

When `evince` says ‘Unable to open document "<BLAH>". PDF document is damaged’

During the past few months I’ve been “Docusign”ing things, and the website lets me download the finished document, which is great. But until now I haven’t been able to view them using CLI tools unless I pointed a browser (firefox in my case) at them.

Finally I applied the LMGTFY principle (the 21st century version of RTFM): I did a web search on the message, which eventually led me to this nice article on superuser.com. The short version of The Answer is:

gs -o FIXED.pdf -sDEVICE=pdfwrite  -dPDFSETTINGS=/prepress CORRUPTED.pdf

That↑ worked great for me, though someone recommends -dPDFSETTINGS=/default instead.
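
For a whole batch of downloads, the same command could be wrapped in a small shell function. This is only a sketch: repaired_name, repair_pdf, and the "-fixed" suffix are hypothetical names chosen for illustration, and it assumes Ghostscript (gs) is on the PATH.

```shell
# Hypothetical helper: derive an output name, e.g. lease.pdf -> lease-fixed.pdf
repaired_name() {
    printf '%s-fixed.pdf\n' "${1%.pdf}"
}

# Rewrite one PDF through Ghostscript's pdfwrite device, using the same
# options as the command above. The original file is left untouched.
repair_pdf() {
    gs -o "$(repaired_name "$1")" -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress "$1"
}
```

Something like `for f in *.pdf; do repair_pdf "$f"; done` would then repair every PDF in the current directory.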

Tuesday, January 16, 2024

Suddenly my automounts don't work... and a hack fix

The server, p64, is debian 11 (bullseye); the client is mac os x (darwin kernel version 23.1.0 Mon Oct 9 21:27:27 PDT 2023). Automounts used to work but somehow stopped; I'm not sure when. I have homedirs on Linux, as /home; I want to read/write the Linux /home as, uh, /home.

After searching frantically, here is something that kinda works. First, on linux:

sudo systemctl start rpc-statd
sudo service rpcbind start

Then, on mac os, forget about automounting, just do it manually. As root:
mount -o resvport p64:/home /home
and it all works.

I want to figure out the "real" automounter problem, but right now I just want to do what I sat down to do.
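
Until the automounter is sorted out, the manual mount could be wrapped so that running it twice does no harm. A sketch only: is_mounted and mount_home are hypothetical names, and the check assumes that mount prints lines of the form "DEVICE on MOUNTPOINT ...".

```shell
# is_mounted MOUNTPOINT
# Succeeds if the `mount` output arriving on stdin lists MOUNTPOINT.
is_mounted() {
    grep -q " on $1 "
}

# Mount the server's /home unless something is already mounted there.
# Run as root on the mac, same as the manual command above.
mount_home() {
    if mount | is_mounted /home; then
        echo "/home is already mounted"
    else
        mount -o resvport p64:/home /home
    fi
}
```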

Saturday, January 14, 2023

assimilating a new (to us) imac: SMTP mail

So the mac mini is almost a teenager so we got a new box. According to "About this Mac" it's "iMac (retina 5K..., 2019)" running macOS Monterey 12.0.1. I need to get dovecot on it, among other things.

<…time passes…>

OK that was... October maybe? I moved "all" the files with either scp or rsync, upgraded to Ventura, installed crashplan for small business and told it to back up Carol's files (and to stop backing up the old mac mini). Carol's been using the new machine to good effect for a few months now, but I'm still using the mini to fetch SMTP mail from my ISP. The setup is byzantine, and in case I'm still using that email when we replace the 2019 iMac, I'll record how it handles smtp mail for my future self.

Fetching the mail here

The mini runs a "service"... I thought we could "fetchmail -d 60" but how to send password encrypted? It would certainly be bad medicine authenticating in cleartext!

The solution involves an ssh tunnel and a macos "service." Apparently if you put an XML file named something.plist in /System/Library/LaunchDaemons/ then macos will run it as root on startup. Mine looked like this:

unknownc42c0321f10e:~ admin$ cat /System/Library/LaunchDaemons/collin.admin.tunnel.plist         
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Label</key>
		<string>collin.admin.tunnel</string>
	<key>Program</key>
		<string>/Users/admin/tunnel.sh</string>
	<key>RunAtLoad</key> 
		<true/>
</dict>
</plist>
unknownc42c0321f10e:~ admin$

OK actually on the mini the username was "postman"; on the imac it'll be "admin" so I'm changing it here.

OOPS... on the iMac, running Ventura, we can't touch /System/Library/LaunchDaemons; instead I had to add the above as /Library/LaunchDaemons/collin.admin.tunnel.plist; I hope it works.

What does /Users/admin/tunnel.sh do? It establishes a tunnel to my ISP, making localhost port 60110 tunnel to the POP server's port 110 for about a minute or two. Then it runs fetchmail. Like this:

#!/bin/bash
ID=$(id -u)
if [[ $ID == 0 ]] ; then
    echo /Users/admin/tunnel.sh | su - admin
    exit 0
fi
PATH=$PATH:/usr/sbin:/opt/local/bin:/usr/bin:/bin
while :; do
        if netstat -an -finet|grep LISTEN | grep 60110 > /dev/null; then
                : be happy
        else
                ssh -f sonic -L 60110:pop.sonic.net:110 sleep 120 >/dev/null &
        fi
        sleep 30                # That should be long enough to open socket
        fetchmail --sslproto "" >> tmp/fetchmail.log 2>&1 &
        FPID=$!
        sleep 120
        kill $FPID
        sleep 10
done

I had a little surprise with the .fetchmailrc: I can't say

poll localhost proto pop3 port 60110 user ISP-username pass ISP-password is admin here fetchall mda "/usr/bin/sendmail -i -f %F %T"

because fetchmail won't let me fetch from localhost. So I have a hack in /etc/hosts:

127.0.0.1       localhost see.admin.fetchmailrc.invalid

and now fetchmail can, well, fetch the mail.

AND ANOTHER THING... I never used to have to say “--sslproto ""” but it now seems necessary lest I get some SSL error.

Once the mail gets here

… sendmail (or maybe postfix) will try to deliver it, probably to /var/spool/mail/WHATEVER. But we don't want that, so we have to supply a .forward file:

admin@Admins-iMac-2 ~ % cat .forward                                                           
"|/opt/local/bin/procmail"
admin@Admins-iMac-2 ~ %

And a .procmailrc, which tries to figure out who the email is addressed to. If there's a header that says
To: collin@<ourdomain>
then that's easy; it's addressed to me.

But what if there's no header like that? What if I'm bcc:-ed? Basically we look for a useful Received: header. Anyway, the point is, admin's .procmailrc file tries to figure out who the email is for, and then it sends the email to Carol or to me, or to the bit-bucket. It sends the email to us by running /usr/sbin/sendmail, so if I want email processed by procmail, I again have to have a $HOME/.forward, just like "admin" did. And my own $HOME/.procmailrc.

Other stuff

I have to run dovecot on the iMac, but only for Carol's email. She hasn't looked at it for months now, so when she decides to have a look, I'll probably have to figure out how to run dovecot on it.

As for me, I'll nfs-mount $HOME/Maildir from the iMac onto my linux box, which is where I read non-web email. The iMac wasn't exporting any filesystems when we brought it home, so I just did what came naturally: copy /etc/exports from the teen-aged mac mini:

admin@Admins-iMac-2 ~ % cat /etc/exports                                                       
/Users  -network 192.168.1.0 -mask 255.255.255.0

I'll mount that and symlink Maildir there to $HOME/Maildir on the Linux box.

Then I think I should remove /System/Library/LaunchDaemons/collin.postman.tunnel.plist from the mac mini... oh, wait, no, I don't have to do that; I can just make the script do nothing I think.

Then rsync to make the iMac's copy of $HOME/Maildir match the mac mini's copy... for both Carol and me

Admins-iMac-2:~ carol$ time rsync -av 192.168.1.131:Maildir ./
receiving file list ... done
Maildir/
Maildir/log
Maildir/msgid.cache
Maildir/new/
Maildir/new/1673726769.51227_2.unknownc42c0321f10e.attlocal.net
Maildir/new/1673737810.52745_2.unknownc42c0321f10e.attlocal.net
Maildir/new/1673746452.53955_2.unknownc42c0321f10e.attlocal.net
Maildir/tmp/

sent 66405 bytes  received 520272 bytes  7287.91 bytes/sec
total size is 675278927  speedup is 1151.02

real	1m20.414s
user	0m0.292s
sys	0m0.200s
Admins-iMac-2:~ carol$

Mine will take rather longer I think...

Then install

Monday, July 25, 2022

upgrade debian stretch → buster → bullseye

I hate upgrades, but python3 on my debian stretch box doesn't grok f-strings, because its python3 is python3.5; f-strings were added in python3.6. It’s been a couple years since I upgraded to stretch (debian9), and security updates have just been discontinued, so I thought, why not skip buster (debian10) and just upgrade to debian11 (bullseye)? Then maybe I could wait four years rather than two before having to do it again.

Of course I didn't see any instructions for a 2-release upgrade, so the first thing was to upgrade stretch to buster. I basically followed the instructions in this article. As root:

change “stretch” to “buster” in /etc/apt/sources.list
Casting all caution to the wind, I skipped the part about making a backup
apt-get update; apt-get upgrade; apt-get dist-upgrade
reboot
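
The first step in the list above can be done with sed rather than an editor. A minimal sketch, assuming a sed that accepts -i with a backup suffix; retarget_sources is a hypothetical name, and unlike the steps above it keeps a copy of the old file as FILE.bak.

```shell
# retarget_sources OLD NEW FILE
# Replace every occurrence of the release name OLD with NEW in FILE,
# keeping the previous contents in FILE.bak.
retarget_sources() {
    sed -i.bak -e "s/$1/$2/g" "$3"
}
```

As root, that would be `retarget_sources stretch buster /etc/apt/sources.list`. One caveat for the later hop to Bullseye: Debian 11 renamed the security suite from buster/updates to bullseye-security, so a blind substitution of "bullseye" for "buster" leaves a security line that still needs fixing by hand.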

That’s all I did. There was one surprise: thunderbird complained about not being able to connect to “mini1” but I had no idea why. I did, however, try to ssh there from my now-buster desktop; passwordless ssh failed. “ssh -v” showed me that I had the wrong kind of keys now, so I regenerated keys on mini1 (a mac mini), added the contents of .ssh/id_mini1.pub to mini1’s .ssh/authorized_keys, and copied the private key into .ssh/id_mini1. Things started looking better. But thunderbird still said it couldn't connect to mini1. Why was that? The server settings said it was connecting to 127.0.0.1:143; was I running dovecot locally? I had to be, right? Yes, according to /etc/dovecot/dovecot.conf:

# A comma separated list of IPs or hosts where to listen in for connections.
# "*" listens in all IPv4 interfaces, "::" listens in all IPv6 interfaces.
# If you want to specify non-default ports or anything more complex,
# edit conf.d/master.conf.
#listen = *, ::
listen = 127.0.0.2

So when I tried "sudo dovecot", it said the ssl key couldn’t be found, and even gave me a pathname. So I commented it out in /etc/dovecot/conf.d/10-ssl.conf:

##
## SSL settings
##

# SSL/TLS support: yes, no, required.
ssl = no                                           ←was “yes”

# PEM encoded X.509 SSL/TLS certificate and private key. They're opened before
# dropping root privileges, so keep the key file unreadable by anyone but
# root. Included doc/mkcert.sh can be used to easily generate self-signed
# certificate, just make sure to update the domains in dovecot-openssl.cnf
ssl_cert = </etc/dovecot/private/dovecot.pem
#ssl_key = </etc/dovecot/private/dovecot.key       ←commented out

I also decided that we don’t need SSL, since hey, 127.0.0.2.

I think I’ll have to do something about the scanner, but otherwise I believe phase I (Stretch → Buster) was pretty easy.

Phase II: Buster → Bullseye

The first part was straightforward but took ... a couple hours? As above, all these steps were done as root:

change /etc/apt/sources.list to refer to Bullseye rather than Buster
apt-get update; apt-get upgrade; apt-get dist-upgrade
reboot

And then…
“…something about the scanner” ⇐ THIS
So xsane couldn't find the scanner. I got some advice to “sudo scanimage -L” which found only another device (our renter's all-in-one).

A web search on “brother scanners linux” (no quotes) led me to Install the scanner driver (deb) - Linux - BrotherUSA, where I learned what to do, again as root:

collin@p64:~$ brsaneconfig4 -q
                                                             ← nothing appeared here at all!
collin@p64:~$ sudo brsaneconfig4 -a name=mfc9340 model=MFC-9340CDW ip=192.168.1.40
collin@p64:~$ brsaneconfig4 -q
* *MFC-9340CDW [   192.168.1.40]  mfc9340                    ← Now that’s more like it!
collin@p64:~$ sudo scanimage -L
device `brother4:net1;dev0' is a Brother mfc9340 MFC-9340CDW
device `escl:https://192.168.0.235:443' is a HP ENVY 6400 series [BA2627] SSL flatbed scanner
collin@p64:~$

And with that, the scanner works. Mail works. Browsers (both firefox and chrome) work. Maybe something else won’t, but that's all for today.

July 29 update: auto-sleep, crashplan

A new-ish feature in Bullseye (it may have come in with Buster?) is that the box will sleep if I walk away for 20 minutes or so. This is fine, except when I'm logged in over VPN and I'm using vmware horizon (yes, I have to use that for work). I need to disable it when I'm at work, so I just leave the Settings app up, with Power settings selected. Then it's front and center and it's obvious to me that auto-sleep is either on or off. Usually I re-enable it when I'm done with work for the day.

I got email yesterday or so, saying that my backups haven't happened since the OS upgrade. My first thought was, oh, it's because of being asleep. But then I turned off auto-sleep and tried logging in to the code42 app... no joy. After thrashing for a while, I went to the code42.com support site, where I couldn't log in. Oh, I had forgotten that I'd already added google authenticator to firefox! I used it to get my magic rotating number and I was in. I saw the advice to reinstall the app. OK, fine; I downloaded the package, did what came naturally, and said: sudo /usr/local/crashplan/service.sh start

Which didn't work. I looked at /usr/local/crashplan/log/service.log.0 and... something about missing libuaw.so; a web search led me to this post on reddit with the answer: zcat the "CrashPlanSmb_10.0.0.cpi" file (it's in the downloaded tarball), find libuaw.so in the appropriate subdirectory of nlib/ (they didn't have "debian11" but the Reddit-or said ubuntu20, which thankfully worked) and copy it into /usr/local/crashplan/nlib; shazzam! up and running.

August 20 update: convert(1) issue solved; crashplan, not so much

I wanted to convert a ".png" file to PDF, as I have done many times before, but now I get:

convert-im6.q16: attempt to perform an operation not allowed by the security policy `PDF' @ error/constitute.c/IsCoderAuthorized/421.

How annoying! A web search got me to this stackoverflow post; I followed its advice, but kept the old /etc/ImageMagick-6/policy.xml as /etc/ImageMagick-6/policy.xml-dist; here's the diff:

$ diff -u /etc/ImageMagick-6/policy.xml{-dist,}
--- /etc/ImageMagick-6/policy.xml-dist	2021-04-20 07:37:59.000000000 -0700
+++ /etc/ImageMagick-6/policy.xml	2022-08-20 20:19:06.962756505 -0700
@@ -90,10 +90,12 @@
   <!-- in order to avoid to get image with password text -->
   <policy domain="path" rights="none" pattern="@*"/>
   <!-- disable ghostscript format types -->
-  <policy domain="coder" rights="none" pattern="PS" />
+  <!--  -->
+  <policy domain="coder" rights="read|write" pattern="PS" />
   <policy domain="coder" rights="none" pattern="PS2" />
   <policy domain="coder" rights="none" pattern="PS3" />
   <policy domain="coder" rights="none" pattern="EPS" />
-  <policy domain="coder" rights="none" pattern="PDF" />
+  <!--  -->
+  <policy domain="coder" rights="read|write" pattern="PDF" />
   <policy domain="coder" rights="none" pattern="XPS" />
 </policymap>

And now, convert(1) can do everything I was accustomed to using it for.
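
The same change to policy.xml could be made with sed instead of by hand. This is a sketch, not what the diff above was produced with: allow_coder is a hypothetical name, and the backup lands in FILE.bak rather than policy.xml-dist.

```shell
# allow_coder FILE CODER
# In an ImageMagick policy file, change rights="none" to rights="read|write"
# on the line whose pattern is exactly CODER. A backup is kept in FILE.bak.
allow_coder() {
    sed -i.bak -e "/pattern=\"$2\"/s/rights=\"none\"/rights=\"read|write\"/" "$1"
}
```

For example, as root: `allow_coder /etc/ImageMagick-6/policy.xml PDF`. Because the match includes the surrounding quotes, asking for PS leaves the PS2 and PS3 lines alone.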

Crashplan, though… I thought (from a few weeks back) that the service started so everything was good, but not so much! /usr/local/crashplan/log/engine_output.log ends like this:

  Java virtual machine created.
Starting service.
[08.20.22 10:24:56.599 INFO  main             com.code42.utils.ClassFinder] Loaded classpaths in 1960 ms
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fc570a458a7, pid=326423, tid=326535
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.12+7 (11.0.12+7) (build 11.0.12+7)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.12+7 (11.0.12+7, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libuaw.so+0x1c8a7]  std::filesystem::path::~path()+0x7
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /usr/local/crashplan/hs_err_pid326423.log
[thread 326523 also had an error]
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
#

WHOA, did you see that libuaw.so part up there? Do I maybe have a bad libuaw.so? It's late now, though, so maybe in another day or two I'll…

September 24 update: crashplan solved!

OK, so what happened last month was: I did more searching and read another reddit post which said that in 2019 (!) there was no excuse not to run apps in containers. Good point! But I didn't want to do that without learning more about containers. I mean, I can spell "runc exec -it <CONTAINERNAME> bash" (wait, did I get that right?) but beyond that…

I ordered and waited for my own personal copy of Docker: Up & Running 2/e then procrastinated… today I decided, better get to it. Now, where was that reddit post that pointed me at the container to…? Huh, couldn’t find it. Instead I re-found the reddit article linked above, but this time I noticed this comment:

Thanks a lot! Worked on Debian 10 with the ubuntu18 file and on Debian 11 with the ubuntu20 file.

WHOA; I’ve got debian11; which one did I install last month (which crashed)? Based on this output

$ ls -o /tmp/code42-install/nlib/*/libuaw.so
-rw-rw-r-- 1 collin 486008 Aug 21 08:25 /tmp/code42-install/nlib/rhel7/libuaw.so
-rw-rw-r-- 1 collin 243440 Aug 21 08:25 /tmp/code42-install/nlib/rhel8/libuaw.so
-rw-rw-r-- 1 collin 232944 Aug 21 08:25 /tmp/code42-install/nlib/ubuntu18/libuaw.so
-rw-rw-r-- 1 collin  52456 Aug 21 08:25 /tmp/code42-install/nlib/ubuntu20/libuaw.so
$

I evidently installed the rhel7 one. D’oh! Replacing it with the ubuntu20 one got crashplan running, without having to do the container thing.
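
The "which one did I install?" question can also be settled by comparing the installed library against each candidate byte for byte. A minimal sketch, assuming the installed copy sits in /usr/local/crashplan/nlib as described earlier; which_variant is a hypothetical name.

```shell
# which_variant INSTALLED NLIB_DIR
# Print the name of each NLIB_DIR subdirectory whose libuaw.so is
# byte-for-byte identical to the INSTALLED copy.
which_variant() {
    for candidate in "$2"/*/libuaw.so; do
        if cmp -s "$1" "$candidate"; then
            basename "$(dirname "$candidate")"
        fi
    done
}
```

Usage would be something like `which_variant /usr/local/crashplan/nlib/libuaw.so /tmp/code42-install/nlib`; no output means the installed file matches none of the candidates.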

PS: the container thing is https://github.com/jlesage/docker-crashplan

Saturday, November 06, 2021

`__stack_chk_fail()`: What It Means

Recently I had a “stack smashing” incident to debug at work. It turned out to be a little more complicated than the example I'm about to show you, but at the bottom it was the same. Here's a silly example program.

collin@collin-t450:~/stack-chk$ pr -tn smash.c
    1	#include <stdio.h>
    2	#include <string.h>
    3	
    4	/*
    5	 * Bad programming practice
    6	 */
    7	static void
    8	oops(char const *buf)
    9	{
   10		char local[10];		/* if strlen(buf) > 9, then */
   11		strcpy(local, buf);	/* this line could smash the stack. */
   12		printf("%s\n", local);
   13	}
   14	
   15	/*
   16	 * this provides a level of indirection.
   17	 */
   18	static void
   19	doit(char const *buf)
   20	{
   21		oops(buf);
   22	}
   23	
   24	int
   25	main(int argc, char **argv)
   26	{
   27		char *msg = "hi there";
   28		if (argc > 1 && argv[1] && *argv[1]) {
   29			msg = argv[1];
   30		}
   31		doit(msg);
   32		return 0;
   33	}
collin@collin-t450:~/stack-chk$

So, main calls doit, passing either a short string—“hi there”—or a string of indeterminate length provided on the command line.

In turn, doit passes that same string to oops, which blindly copies it into a fixed-length buffer, local (line 11). This is a very bad practice because strcpy can overrun the destination (i.e. it can write past the end of local) if the source string (buf) is too long.

We compile it like this:

collin@collin-t450:~/stack-chk$ make smash
cc -fstack-protector-all -Wall -Werror -g    smash.c   -o smash
collin@collin-t450:~/stack-chk$

That -fstack-protector-all says to insert the stack-protector (or stack checking) code into every routine. This is a really good idea, and you should always have it in your makefiles.

Now if we run the program with a short string, all is well, but if the string is longer than about 9 bytes, bad things happen:

collin@collin-t450:~/stack-chk$ ./smash
hi there
collin@collin-t450:~/stack-chk$ ./smash hello
hello
collin@collin-t450:~/stack-chk$ ulimit -c unlimited       ←so we can get a coredump in case of abort
collin@collin-t450:~/stack-chk$ ./smash good-morning
good-morning
*** stack smashing detected ***: <unknown> terminated
Aborted (core dumped)
collin@collin-t450:~/stack-chk$

What is “stack smashing”, and how does the code tell that it’s happened? Let’s run gdb on the crash dump and see.

collin@collin-t450:~/stack-chk$ gdb smash core
GNU gdb (Debian 8.2.1-2+b3) 8.2.1
…copyright, GPL, hints, etc. here
Reading symbols from smash...done.
[New LWP 18026]
Core was generated by `./smash good-morning'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fcddd774535 in __GI_abort () at abort.c:79
#2  0x00007fcddd7cb508 in __libc_message (action=, 
    fmt=fmt@entry=0x7fcddd8d607b "*** %s ***: %s terminated\n")
    at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007fcddd85c80d in __GI___fortify_fail_abort (
    need_backtrace=need_backtrace@entry=false, 
    msg=msg@entry=0x7fcddd8d6059 "stack smashing detected") at fortify_fail.c:28
#4  0x00007fcddd85c7c2 in __stack_chk_fail () at stack_chk_fail.c:29
#5  0x0000556ce7f921a4 in oops (buf=0x7ffcbf978577 "good-morning") at smash.c:13
#6  0x0000556ce7f921cd in doit (buf=0x7ffcbf978577 "good-morning") at smash.c:21
#7  0x0000556ce7f9224d in main (argc=2, argv=0x7ffcbf977e68) at smash.c:31
(gdb)

Right. gdb’s “bt” command displays a backtrace; the above shows main calling doit calling oops, which called __stack_chk_fail. The numbers on the left are the frame numbers at the time the crash dump was taken.

I’ll belabor the maybe-obvious for a bit before continuing. Each frame is a record of where the caller expects to resume execution, when/if the callee returns; that is, the caller’s return-address is pushed onto the stack and then the machine begins executing the callee, in the new frame.

Let's see how __stack_chk_fail was called.

(gdb) f 5
#5  0x0000556ce7f921a4 in oops (buf=0x7ffcbf978577 "good-morning") at smash.c:13
13	}
(gdb) disass oops
Dump of assembler code for function oops:
   0x0000564f056b5155 <+0>:	push   %rbp
   0x0000564f056b5156 <+1>:	mov    %rsp,%rbp
   0x0000564f056b5159 <+4>:	sub    $0x30,%rsp
   0x0000564f056b515d <+8>:	mov    %rdi,-0x28(%rbp)
   0x0000564f056b5161 <+12>:	mov    %fs:0x28,%rax       put magic value → %rax
   0x0000564f056b516a <+21>:	mov    %rax,-0x8(%rbp)     stash %rax; → %rbp-8  
   0x0000564f056b516e <+25>:	xor    %eax,%eax
   0x0000564f056b5170 <+27>:	mov    -0x28(%rbp),%rdx
   0x0000564f056b5174 <+31>:	lea    -0x12(%rbp),%rax
   0x0000564f056b5178 <+35>:	mov    %rdx,%rsi
   0x0000564f056b517b <+38>:	mov    %rax,%rdi
   0x0000564f056b517e <+41>:	callq  0x564f056b5030 <strcpy@plt>
   0x0000564f056b5183 <+46>:	lea    -0x12(%rbp),%rax
   0x0000564f056b5187 <+50>:	mov    %rax,%rdi
   0x0000564f056b518a <+53>:	callq  0x564f056b5040 <puts@plt>
   0x0000564f056b518f <+58>:	nop
   0x0000564f056b5190 <+59>:	mov    -0x8(%rbp),%rax                          fetch saved magic value
   0x0000564f056b5194 <+63>:	xor    %fs:0x28,%rax                            xor vs real magic
   0x0000564f056b519d <+72>:	je     0x564f056b51a4 <oops+79>                 jump if saved still matches real magic
   0x0000564f056b519f <+74>:	callq  0x564f056b5050 <__stack_chk_fail@plt>    saved value got corrupted; abort
=> 0x0000564f056b51a4 <+79>:	leaveq 
   0x0000564f056b51a5 <+80>:	retq   
End of assembler dump.
(gdb)

The “=>” in the left-hand margin shows what we were about to execute in the frame—that is, the return point from calling __stack_chk_fail. But how did we decide to call it?

Let's go back to the beginning of oops. At the <+12> location, we move %fs:0x28 into %rax. What is %fs:0x28? I'm deducing from the usage that it holds a magic value which we store into %rbp-0x8, uh, I mean -0x8(%rbp)—at <+21>.

Then, at <+59>, we read -0x8(%rbp) into %rax; we xor it with %fs:0x28 at <+63>. If they are equal, the xor at +63 will set %rax to zero; then the je (“jump if equal”) at +72 sends us to the leaveq instruction. But if they are not equal, we call __stack_chk_fail.

To summarize, then, at the beginning of the routine, we store %fs:0x28 into %rbp-0x8; just before returning, we load the (64-bit) word in %rbp-0x8 and compare it to %fs:0x28. If it matches, we’re good, but if not, we call __stack_chk_fail. This stack checking code is inserted into every function—provided that

you use the compiler option -fstack-protector-all
the function can return (i.e., it doesn’t consist only of a no-break, no-return infinite loop)
the function call isn’t optimized out by the optimizer (e.g., compiled with -O0, or function isn’t declared static)

So what is at %rbp-0x8 here?

(gdb) x/xg $rbp-0x8
0x7ffcbf977d18:	0x88a84adec300676e
(gdb)

Alert readers may note that the low-order 3 bytes of the above (i.e., the 00676e) turn out to match the tail end of the string provided on the command line: “ng\0”; this is an effect of a bad programming practice: we wrote into a 10-byte buffer, but we wrote more than 10 bytes!

(gdb) info locals
local = "good-morni"
(gdb) p sizeof local
$1 = 10
(gdb) x/s local
0x7ffcbf977d0e:	"good-morning"
(gdb)

So by writing past the end of the 10-byte buffer “local[]”, we stomped (with “ng\0”) on the magic value used for stack check.
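
To see why 00676e reads as “ng\0”: x86-64 is little-endian, so the lowest-order byte of the 64-bit word sits at the lowest address. A small sketch that turns hex byte values back into characters (bytes_to_text is a made-up helper; the trailing 00 is left out because a shell string cannot hold a NUL):

```shell
# Print the characters for a list of hex byte values, in the order given.
bytes_to_text() {
    for byte in "$@"; do
        # The inner printf turns e.g. 6e into octal 156; the outer one
        # then interprets \156 and prints "n".
        printf "\\$(printf '%03o' "0x$byte")"
    done
    printf '\n'
}

# Low-order bytes of 0x88a84adec300676e, lowest address first.
bytes_to_text 6e 67
```

This prints "ng", the last two characters of “good-morning”.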

Now let’s have a look at the value(s) of %fs:0x28 stored elsewhere, starting one level “up,” that is, with oops’s caller:

(gdb) up
#6  0x0000556ce7f921cd in doit (buf=0x7ffcbf978577 "good-morning") at smash.c:21
21		oops(buf);
(gdb) x/8i doit
   0x556ce7f921a6 <doit>:	push   %rbp
   0x556ce7f921a7 <doit+1>:	mov    %rsp,%rbp
   0x556ce7f921aa <doit+4>:	sub    $0x20,%rsp
   0x556ce7f921ae <doit+8>:	mov    %rdi,-0x18(%rbp)
   0x556ce7f921b2 <doit+12>:	mov    %fs:0x28,%rax       put magic value → %rax
   0x556ce7f921bb <doit+21>:	mov    %rax,-0x8(%rbp)     stash %rax; → %rbp-8
   0x556ce7f921bf <doit+25>:	xor    %eax,%eax
   0x556ce7f921c1 <doit+27>:	mov    -0x18(%rbp),%rax
(gdb) x/xg $rbp-8
0x7ffcbf977d48:	0x88a84adec3f40c00
(gdb)

Now let's try one more.

(gdb) up
#7  0x0000556ce7f9224d in main (argc=2, argv=0x7ffcbf977e68) at smash.c:31
31		doit(msg);
(gdb) x/8i main
   0x556ce7f921e4 <main>:	push   %rbp
   0x556ce7f921e5 <main+1>:	mov    %rsp,%rbp
   0x556ce7f921e8 <main+4>:	sub    $0x20,%rsp
   0x556ce7f921ec <main+8>:	mov    %edi,-0x14(%rbp)
   0x556ce7f921ef <main+11>:	mov    %rsi,-0x20(%rbp)
   0x556ce7f921f3 <main+15>:	mov    %fs:0x28,%rax       put magic value → %rax
   0x556ce7f921fc <main+24>:	mov    %rax,-0x8(%rbp)     stash %rax; → %rbp-8  
   0x556ce7f92200 <main+28>:	xor    %eax,%eax
(gdb) x/xg $rbp-8
0x7ffcbf977d78:	0x88a84adec3f40c00
(gdb)

Now compare the above vs. what we had in $rbp-8 in frame 5:
0x7ffcbf977d78: 0x88a84adec3f40c00 ← frame 7
0x7ffcbf977d48: 0x88a84adec3f40c00 ← frame 6
0x7ffcbf977d18: 0x88a84adec300676e ← frame 5
Identical except for the low-order 3 bytes. The value of %fs:0x28 stored by main and doit match; the value in oops doesn’t. And that’s how the program knew there really was stack smashing.
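
The xor that the generated code performs can be replayed with shell arithmetic on the values gdb printed. A sketch under two assumptions: that the copies saved by doit and main are intact (so they equal %fs:0x28), and that splitting each 64-bit value into 32-bit halves is acceptable; the split is an arbitrary choice made here to stay well inside the shell's signed integers. All variable names are invented for the example.

```shell
saved_hi=0x88a84ade;  saved_lo=0xc300676e    # frame 5 (oops): value at $rbp-8
guard_hi=0x88a84ade;  guard_lo=0xc3f40c00    # frames 6 and 7 (doit, main)

# xor the two values half by half; all zeros would mean "no smashing".
canary_diff() {
    printf '%08x%08x\n' $(( saved_hi ^ guard_hi )) $(( saved_lo ^ guard_lo ))
}

canary_diff
```

The result is 0000000000f46b6e: nonzero only in the low-order 3 bytes, so the je is not taken and __stack_chk_fail gets called.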

A few more points

The stack checking code doesn’t always catch overruns. It did in this case because the variable named local was immediately below (i.e., lower memory address) the spot where the magic value was stashed away, and we overran local by a few bytes. But if we did something nastier in oops, like
local[59] = 'x';
then oops’s magic value would not have been disturbed. Probably doit’s magic value would have been detectably corrupted, and the backtrace would have shown doit, not oops, calling __stack_chk_fail.
If local had been allocated via malloc(3) with that size, rather than being an on-stack variable, buffer overruns might be detected by bug-catching code in malloc or free, rather than code surrounding a call to __stack_chk_fail.

As alluded to earlier, if function(s) are declared static and the file is compiled with optimization, the corruption may occur in an “interior” or lower-level routine (a callee of a callee of…) but the stack-checking code may be present in only the caller. This is in fact what happened when I added “-O2” to the compilation command for smash.c

collin@collin-t450:~/stack-chk$ cc -fstack-protector-all -Wall -Werror -g -O2   smash.c   -o smash
collin@collin-t450:~/stack-chk$ ./smash good-morning
good-morning
*** stack smashing detected ***: <unknown> terminated
Aborted (core dumped)
collin@collin-t450:~/stack-chk$ gdb smash core
...
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from smash...done.
[New LWP 16201]
Core was generated by `./smash good-morning'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f3d06aea535 in __GI_abort () at abort.c:79
#2  0x00007f3d06b41508 in __libc_message (action=, 
    fmt=fmt@entry=0x7f3d06c4c07b "*** %s ***: %s terminated\n")
    at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007f3d06bd280d in __GI___fortify_fail_abort (
    need_backtrace=need_backtrace@entry=false, 
    msg=msg@entry=0x7f3d06c4c059 "stack smashing detected")
    at fortify_fail.c:28
#4  0x00007f3d06bd27c2 in __stack_chk_fail () at stack_chk_fail.c:29
#5  0x000055a71abc70d9 in main (argc=<optimized out>, argv=)
    at smash.c:32
(gdb)
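
Going back to the local[59] case from the first point: a little address arithmetic shows where that write would land. This is a sketch using addresses from the unoptimized gdb session earlier (local from "x/s local", doit's saved magic value from "x/xg $rbp-8" in frame 6); the variable names are made up.

```shell
local_addr=0x7ffcbf977d0e      # address of local[] in oops
doit_guard=0x7ffcbf977d48      # address of doit's saved magic value (8 bytes)

target=$(( local_addr + 59 ))  # address of local[59]
offset=$(( target - doit_guard ))

printf 'local[59] is at 0x%x\n' "$target"
if [ "$offset" -ge 0 ] && [ "$offset" -lt 8 ]; then
    echo "that is byte $offset of doit's saved magic value"
else
    echo "that is outside doit's saved magic value"
fi
```

With these addresses, local[59] comes out at 0x7ffcbf977d49, which is byte 1 of doit's saved value, consistent with the guess that doit rather than oops would have tripped the check.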

Thursday, November 04, 2021

Animations with gifsicle... plus complications

Some years ago, I discovered the excellent gifsicle, which I've used ~~a few~~ many times to create animated GIFs, like this sunrise at Zabriskie Point. Today I did a slightly(?) more difficult one, made harder because I was snapping away with a phone, and about half-way through I changed the phone's orientation, switching from “portrait” (tall) to “landscape” (wide) images. It also didn't help that I know only the basics of image files. The good news, though, is that ImageMagick’s amazing convert(1) program has every capability we need to give gifsicle what it needs for a reasonable-looking animation. But I’m getting ahead of myself.

At first, I did what came naturally: downloaded the photos, converted them all to "gif"s, and told gifsicle to create the animated gif. Well, it was a disaster. Not a real disaster (no animals were harmed), but the animation started out in portrait mode and then switched to… Bad. The original files didn't tell me that though!

collin@collin-t450:/tmp/iCloud Photos$ file IMG_*.JPG | sed -e 's/Exif.*one 6,//' -e 's/xresolution.*precision 8,//'
IMG_0733.JPG: JPEG image data,  orientation=upper-right,  3264x2448, components 3
IMG_0734.JPG: JPEG image data,  orientation=upper-right,  3264x2448, components 3
IMG_0735.JPG: JPEG image data,  orientation=upper-right,  3264x2448, components 3
IMG_0736.JPG: JPEG image data,  orientation=upper-right,  3264x2448, components 3
IMG_0737.JPG: JPEG image data,  orientation=upper-right,  3264x2448, components 3
IMG_0738.JPG: JPEG image data,  orientation=upper-right,  3264x2448, components 3
IMG_0739.JPG: JPEG image data,  orientation=upper-right,  3264x2448, components 3
IMG_0740.JPG: JPEG image data,  orientation=upper-left,  3264x2448, components 3
IMG_0741.JPG: JPEG image data,  orientation=upper-left,  3264x2448, components 3
IMG_0742.JPG: JPEG image data,  orientation=upper-left,  3264x2448, components 3
IMG_0743.JPG: JPEG image data,  orientation=upper-left,  3264x2448, components 3
IMG_0744.JPG: JPEG image data,  orientation=upper-left,  3264x2448, components 3
collin@collin-t450:/tmp/iCloud Photos$

The “sed ...” removes a bunch of “TIFF image data...” stuff that was the same for all the files; I wanted the above output to be more readable. The thing I want you to notice is that all the image files are supposedly 3264x2448 pixels. Now there is a clue in the “orientation=” stuff, but as I said, I know only very basic stuff about these things.

Now part of my process (i.e., when doing what came naturally) was to convert these JPEG files to GIFs. I think I did something like

for F in IMG*JPG; do convert $F ${F%.JPG}.gif; done

After that, the switch from portrait to landscape became more obvious:

collin@collin-t450:/tmp/iCloud Photos$ file IMG*gif
IMG_0733.gif: GIF image data, version 89a, 2448 x 3264
IMG_0734.gif: GIF image data, version 89a, 2448 x 3264
IMG_0735.gif: GIF image data, version 89a, 2448 x 3264
IMG_0736.gif: GIF image data, version 89a, 2448 x 3264
IMG_0737.gif: GIF image data, version 89a, 2448 x 3264
IMG_0738.gif: GIF image data, version 89a, 2448 x 3264
IMG_0739.gif: GIF image data, version 89a, 2448 x 3264
IMG_0740.gif: GIF image data, version 89a, 3264 x 2448    ←landscape begins here
IMG_0741.gif: GIF image data, version 89a, 3264 x 2448
IMG_0742.gif: GIF image data, version 89a, 3264 x 2448
IMG_0743.gif: GIF image data, version 89a, 3264 x 2448
IMG_0744.gif: GIF image data, version 89a, 3264 x 2448
collin@collin-t450:/tmp/iCloud Photos$

Now the good news is that when I switched from portrait to landscape, Sheri's head remained roughly centered and about the same distance from the top of the image. To cut to the chase, I wrote a shell “one-liner” like this:
collin@collin-t450:/tmp/iCloud Photos$ for F in IMG_07*gif; do NEW=new${F#IMG_}; if file $F | grep "3264 x"; then C='2448x2448+408+0!' ; else C='2448x2448+0+0!'; fi; convert $F -crop $C +repage -resize 50% -remap IMG_0733.gif $NEW; done
which I'll explain tersely because it's time to go do something with the lovely Carol.

NEW=new${F#IMG_} sets $NEW to $F with the leading IMG_ replaced by new; IMG_0733.gif becomes new0733.gif, for example.
we check for landscape (those 3264 x 2448 images), and change the cropping parameter to fit the orientation of the original; that's what the C= stuff is.
we need +repage to make the “canvas size” fit the image boundaries. No, I don't really know what that means. But if I don't do it, everything looks weird.
-remap is so that all the new files will share the same colormap. Because gifsicle requires that.
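
The choice of cropping parameter can also be sketched in Python. This is only an illustration: the function name crop_geometry is made up here, and it assumes (as with these photos) that the subject sits near the top of the frame, so the square is centered horizontally and aligned with the top edge.

```python
def crop_geometry(width, height):
    """Return an ImageMagick-style geometry string for a square crop.

    The side of the square is the shorter image dimension. The square is
    centered horizontally and aligned with the top edge of the image.
    """
    side = min(width, height)
    x_offset = (width - side) // 2   # 0 for portrait, half the excess width for landscape
    y_offset = 0                     # always keep the top of the image
    return f"{side}x{side}+{x_offset}+{y_offset}!"


print(crop_geometry(2448, 3264))  # portrait frames
print(crop_geometry(3264, 2448))  # landscape frames
```

For the two sizes above this reproduces the two values of C in the one-liner.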

then to make the animation,

collin@collin-t450:/tmp/iCloud Photos$ gifsicle -d 20 new*gif -o foo.gif

you can see the result on facebook if you’re Sheri’s “friend” there.

Sunday, April 25, 2021

How many pages per character? (not a typo)

NOTE: if the tables below look smooshed, try clicking on the title above ("How many pages...")

Suppose you're writing a novel, or some other large-ish prose, and you want to balance the number of pages that your main characters or points-of-view (etc.) get. Or you just want to see how many pages each character (etc.) gets. If your magnum opus were a half-dozen pages or so, you could print it out and mark it up with colored highlighters or something, but if it's several hundreds of pages, with several dozen chapters, you might want to make a table.

action	#pages
Adam adapts	2
Emily eats	7
Harry harrumphs	1
Kayla kicks	8
Emily investigates	2
Harry gets a haircut	8

That's all well and good, but if the table has a hundred (or hundreds) of rows, how do you figure out the distribution of pages among these characters? You could split the #pages column into multiple columns—one for each of your characters, like this:

action	Adam	Emily	Harry	Kayla
Adam adapts	2
Emily eats		7
Harry harrumphs			1
Kayla kicks				8
Emily investigates		2
Harry gets a haircut			8

Then you can tell LibreOffice Calc or Mi¢ro$oft Excel™ to add up each column, and you can compare the column totals, like this maybe

action	Adam	Emily	Harry	Kayla
Adam adapts	2
Emily eats		7
Harry harrumphs			1
Kayla kicks				8
Emily investigates		2
Harry gets a haircut			8
totals	2	9	9	8

But wait; this replaces one problem with another. What if the next scene/whatever you want to add is about Adam running into some issue: 5 pages, but you inadvertently add his #pages to the wrong column?

action	Adam	Emily	Harry	Kayla
Adam adapts	2
Emily eats		7
Harry harrumphs			1
Kayla kicks				8
Emily investigates		2
Harry gets a haircut			8
Adam breaks a leg		5
totals	2	14←oops	9	8

Here's an idea: this spreadsheet is on a computer, right? Aren't computers good at things like putting a number into a particular column, depending on the contents of another spreadsheet cell?

After taking way too long to figure this out, I thought I'd share with you what I did: I created a spreadsheet that looks like the first figure above, where we populate the #pages column. Then, consulting multiple web-search results, I constructed formulas to copy the numbers in the #pages column into the appropriate cells in the "Adam", "Emily", etc. columns, depending on the contents of the action column. In the following table, the numbers under Adam, Emily, etc., are all computed by the spreadsheet:

action	#pages	Adam	Emily	Harry	Kayla
Adam adapts	2	2
Emily eats	7		7
Harry harrumphs	1			1
Kayla kicks	8				8
Emily investigates	2		2
Harry gets a haircut	8			8
totals	28	2	9	9	8
	total→	28

As a bonus, we can add up the "Adam", "Emily", etc., pages; if the sum of all those columns matches the sum of the #pages column, that suggests that maybe each of those #pages numbers matched exactly one of "Adam", "Emily", etc.; we didn't miss any and we didn't double-count any. (We could miss by mis-spelling someone's name, and we could double-count by inadvertently including someone's name in the middle of a word. "Kayla reviews Madame Bovary" for example could match both "Adam" and "Kayla"; or if you wrote "Emily and Adam play squash" as one of the actions.) The spreadsheet uses a case-insensitive search to match the character names. There is probably an easy way to make that case-sensitive, but I'll leave that as an exercise for the reader.
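
For readers who would rather see the idea outside a spreadsheet, here is a minimal Python sketch of the same logic. It is not the formula used in the downloadable files; the function name distribute_pages and the extra "Madame Bovary" row with 3 pages are made up for illustration. It does a case-insensitive substring search, just like the description above, so it shows the same double-counting trap.

```python
def distribute_pages(rows, names):
    """Credit each row's page count to every name found in its action text.

    rows  : list of (action, pages) tuples
    names : list of character names
    Returns (per-name totals, sum of the #pages column).
    """
    totals = {name: 0 for name in names}
    for action, pages in rows:
        for name in names:
            # Case-insensitive substring match, so "Madame" matches "Adam".
            if name.lower() in action.lower():
                totals[name] += pages
    grand_total = sum(pages for _, pages in rows)
    return totals, grand_total


names = ["Adam", "Emily", "Harry", "Kayla"]
rows = [
    ("Adam adapts", 2),
    ("Emily eats", 7),
    ("Harry harrumphs", 1),
    ("Kayla kicks", 8),
    ("Emily investigates", 2),
    ("Harry gets a haircut", 8),
]

totals, grand_total = distribute_pages(rows, names)
print(totals, grand_total)
print("check passes:", sum(totals.values()) == grand_total)

# A hypothetical row that matches two names; the check should now fail.
bad_totals, bad_grand = distribute_pages(
    rows + [("Kayla reviews Madame Bovary", 3)], names
)
print("check passes:", sum(bad_totals.values()) == bad_grand)
```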

Here's how you can use it.

Download the libreoffice/openoffice file or the Excel™ file
Replace "Adam", "Emily", etc., by your novel's (principal?) characters' names. If you have more than four characters:
1. Put additional names to the right of Kayla in row 1
2. Select C2–F2 and extend rightward to G2, H2, etc. to accommodate your additional characters
3. Select B8–F8 and extend rightward to G8, H8, etc. to accommodate your additional characters
Delete rows 3–6 (leave "Harry gets a haircut" as a placeholder; you'll see why shortly)
Insert a dozen or a hundred rows before the "Harry gets a haircut" placeholder row, and populate columns C3–H103 (or however far to the right) by extending C2–H2 (or however…) downward to cover the new rows.
As you insert action cells (start by replacing "Adam adapts") and fill in the corresponding value in the #pages column, the totals in the totals row should automagically update, because that row has cells that SUM everything from row 2 ("Adam adapts") up to and including the "Harry gets a haircut" row.
[If I had planned further ahead, I might have called that row "leave empty for expansion" or something]
To make the error-checking thing work for your expanded spreadsheet, ensure that the cell to the right of "total→" is the sum of the previous row's columns B–H (or however…)

Sunday, December 06, 2020

A lucky hdd recovery adventure

In all my years of home computing, I never had a catastrophic hard drive failure. Saturday's excitement felt like one, with I/O errors on the journal.

The usual graphical display on my computer was replaced by large white typewriter-style characters on a dark background, dire warnings about I/O errors, nonexistent device on remount, and other words I don't recall. I found a USB drive that still had knoppix 8.6, and booted. Diagnostics (CPU, memory) found no problems so I tried "e2fsck -y" on the failed partitions (both / and /home, which made me think "electronics" rather than "bad spot on the disk"). That ended with "Filesystem still contains errors!"

I tried to remove the journal with tune2fs, which insisted on first trying to replay the journal, that's right, the very journal I wanted to ignore/remove. After a few hours of futility, I had an idea. Since there were I/O errors on the disk drive, but all were happening at rather high block numbers (with at least 9 digits if I recall correctly), I thought, what if I just did a block-for-block copy onto a new drive?

Off to the store for a 1TB internal SSD with SATA connectors: about $100—a dime per gigabyte! This is an amazing time we live in. "fdisk -l" gave me the bad drive's layout; I partitioned the new drive with the same-size partitions... I think the new drive is slightly larger, but who cares? I copied the filesystem partitions over, and was pleasantly surprised to note that the filesystem uuids matched. (D'oh!) The root partition (just 60GB) was done in less than 10 minutes:

knoppix@Microknoppix:~$ dd if=/dev/sda1 of=/dev/sdb1 bs=1M
61440+0 records in
61440+0 records out
64424509440 bytes (64 GB, 60 GiB) copied, 408.774 s, 158 MB/s
knoppix@Microknoppix:~$
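
A quick sanity check of those dd numbers, as a Python sketch. The only figure not taken from the output above is the size of the /home partition, which comes from the fdisk listing further down; the time estimate assumes the same throughput, which a failing source drive may well not deliver.

```python
MIB = 1024 * 1024

records = 61440          # "61440+0 records in" with bs=1M
elapsed_s = 408.774      # seconds reported by dd

bytes_copied = records * MIB
rate = bytes_copied / elapsed_s          # bytes per second

print(bytes_copied)                      # should equal dd's byte count
print(round(rate / 1e6), "MB/s")

# /home partition: 1794140160 sectors of 512 bytes (from the fdisk listing)
home_bytes = 1794140160 * 512
home_hours = home_bytes / rate / 3600
print(f"/home at the same rate: about {home_hours:.1f} hours")
```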

Muttering "Let fortune favor the foolish," I typed:

root@Microknoppix:/home/knoppix# e2fsck -y /dev/sdb1
e2fsck 1.44.5 (15-Dec-2018)
/dev/sdb1: clean, 309900/3909120 files, 5462752/15728640 blocks
root@Microknoppix:/home/knoppix# mount /dev/sdb1 /mnt
root@Microknoppix:/home/knoppix# ls /mnt
bin   etc    home            lib         media  proc  sbin  tmp  vmlinuz
boot  extra  initrd.img      lib64       mnt    root  srv   usr  vmlinuz.old
dev   foo    initrd.img.old  lost+found  opt    run   sys   var
root@Microknoppix:/home/knoppix#

That may have been a little foolhardy, but I went to bed hoping for similar grace to befall /home (over 800GB). The next morning, the copy was done. The news was happy:

root@Microknoppix:/home/knoppix# fdisk -l /dev/sdb
Disk /dev/sdb: 931.5 GiB, 1000207286272 bytes, 1953529856 sectors
Disk model: SanDisk SSD PLUS
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x5fa89795

Device     Boot     Start        End    Sectors   Size Id Type
/dev/sdb1  *         2048  125831167  125829120    60G 83 Linux                 ← root partition
/dev/sdb2       125831168 1953529855 1827698688 871.5G  5 Extended
/dev/sdb5       125833216  159387647   33554432    16G 82 Linux swap / Solaris
/dev/sdb6       159389696 1953529855 1794140160 855.5G 83 Linux                 ← /home
root@Microknoppix:/home/knoppix# e2fsck -fy /dev/sdb6
e2fsck 1.44.5 (15-Dec-2018)
Pass 1: Checking inodes, blocks, and sizes
Inode 33692836 extent tree (at level 1) could be shorter.  Optimize? yes

Pass 1E: Optimizing extent trees
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/sdb6: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdb6: 299071/56074240 files (1.4% non-contiguous), 54502379/224266934 blocks
root@Microknoppix:/home/knoppix# e2fsck -fy /dev/sdb1
e2fsck 1.44.5 (15-Dec-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb1: 309900/3909120 files (0.4% non-contiguous), 5462752/15728640 blocks
root@Microknoppix:/home/knoppix#

Initial checks suggest that nothing important was lost. If something important had been lost, it would have been mightily inconvenient, but not really catastrophic. That said, I decided to add this device to my crashplan subscription.

Monday, August 31, 2020

pdftk to the rescue (or: how to sign a PDF using a pen)

From the manpage: “If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a simple tool for doing everyday things with PDF documents.”

True, but the usage isn't always intuitive. Well, maybe it would be if I did these things every day. Which I don't. This one thing, that I have to do every few months (trying to make it less frequent), goes something like this:

Receive a PDF; call it orig.pdf
print page 3, sign, and scan; call that sigs.png
Create a new PDF which is the old one for pages 1–2, the scanned image from the previous step, and the old one for page 4; save as signed.pdf

Here is my cheat sheet for next time:

$ convert sigs.png sigs.pdf
$ pdftk A=orig.pdf B=sigs.pdf cat A1-2 B1 A4 output signed.pdf

There. Now the next time I have to do it, I won't have to stare at the manpage and try to reconstruct this incantation.
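
If the signed page ever moves, the page ranges in that incantation have to change too. Here is a small Python sketch that builds the same pdftk command for any page; the helper name pdftk_replace_page is invented for this example, and it only constructs the argument list rather than running anything.

```python
def pdftk_replace_page(orig, replacement, page, total_pages, output):
    """Build a pdftk command that swaps one page of `orig` for page 1 of `replacement`."""
    if not 1 <= page <= total_pages:
        raise ValueError("page must be between 1 and total_pages")

    ranges = []
    if page > 1:
        # Pages before the replaced one, e.g. A1-2
        ranges.append("A1" if page == 2 else f"A1-{page - 1}")
    ranges.append("B1")
    if page < total_pages:
        # Pages after the replaced one, e.g. A4 or A4-7
        first = page + 1
        ranges.append(f"A{first}" if first == total_pages else f"A{first}-{total_pages}")

    return ["pdftk", f"A={orig}", f"B={replacement}", "cat", *ranges, "output", output]


cmd = pdftk_replace_page("orig.pdf", "sigs.pdf", page=3, total_pages=4, output="signed.pdf")
print(" ".join(cmd))
```

Assuming pdftk is installed, the list could be handed to subprocess.run(cmd, check=True).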

Monday, July 13, 2020

Debian9 upgrade

I mentioned in an earlier post that upgrading from debian8 (jessie) to 9 (stretch) solved a display mystery. (I wonder if a 3440x1440 monitor will work on this distro.) This post will record mysteries created, rather than solved.

xsane: no devices found
So I had to download brscan4, I forget where from... brscan4-0.4.9-1.amd64.deb
Then I had to use brsaneconfig. Like this:
```
sudo brsaneconfig4 -a name=mfc9340 model=MFC-9340CDW ip=192.168.1.40
```
…that's it for now!

I understand that upgrading to buster (debian10) will remove python2. Since I'm a Python Dinosaur (not really analogous to my granddaughter's "Dragon Pig" concept), python3 is the new-fangled (or -fanged) thing that I don't quite feel comfortable with.

And then there was a broadband internet service change 2020-08-14
"ping" tried to use ipv6 addresses. So I did this:

collin@p64:~$ cat /etc/sysctl.d/local.conf 
net.ipv6.conf.all.disable_ipv6 = 1
collin@p64:~$

And after filling in DNS servers in /etc/resolv.conf, I made it not a symlink and told NetworkManager not to update it.
BUT! That was a bad idea.

[main]
plugins=ifupdown,keyfile
dns=none                       ←add
...
/etc/NetworkManager/NetworkManager.conf

Why was it a bad idea? Because I thought I could use my ISP's nameservers. But since my IP address was assigned by AT&T, my ISP's nameservers rejected all my queries. I hate rejection. So I'll stop asking.

Wednesday, July 01, 2020

Solved!! LG 34UM58-P Revisited: a Puzzle

In 2017, I bought an LG 34um58p monitor and, after some xrandr(1) hackery, got the display to work, more or less. But the display was always a little ugly, as described in a 2019 whine.

Today, for unrelated reasons, I upgraded my Linux distro from Debian GNU/Linux 8 (jessie) to Debian GNU/Linux 9 (stretch). I rebooted and Surprise! First, I didn't need to run my xrandr script. Second, the pixels actually look good; no smearing/dithering as shown in the aforementioned whine from 2019.

I was so excited to see the new sharper image (no ™) that I whipped out my iPhone™ camera and snapped the pic at right. Although I didn't set up the camera as carefully as I did for last year's shot, I hope you can see how much nicer it looks now.

Saturday, March 28, 2020

Data Recovery... part deux

Trying again, this time using a Tripp-Lite SATA↔usb adapter cable. I had no joy the first few times I plugged this into my Linux box, but then I ran

collin@p64:~$ udevadm monitor --udev
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing

UDEV  [531312.705378] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4 (usb)
UDEV  [531312.708846] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0 (usb)
UDEV  [531312.709620] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host7 (scsi)
UDEV  [531312.710318] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host7/scsi_host/host7 (scsi_host)
UDEV  [531313.701247] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host7/target7:0:0 (scsi)
UDEV  [531313.701815] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host7/target7:0:0/7:0:0:0 (scsi)
UDEV  [531313.702273] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host7/target7:0:0/7:0:0:0/scsi_disk/7:0:0:0 (scsi_disk)
UDEV  [531313.703261] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host7/target7:0:0/7:0:0:0/scsi_device/7:0:0:0 (scsi_device)
UDEV  [531313.703584] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host7/target7:0:0/7:0:0:0/bsg/7:0:0:0 (bsg)
UDEV  [531313.703599] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host7/target7:0:0/7:0:0:0/scsi_generic/sg6 (scsi_generic)
UDEV  [531313.705480] add      /devices/virtual/bdi/8:96 (bdi)
UDEV  [531314.784064] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host7/target7:0:0/7:0:0:0/block/sdg (block)
UDEV  [531314.855365] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host7/target7:0:0/7:0:0:0/block/sdg/sdg1 (block)

Nice to see that. Next was:

collin@p64:~$ sudo fdisk -l /dev/sdg

Disk /dev/sdg: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xcea3ed1a

Device     Boot Start       End   Sectors   Size Id Type
/dev/sdg1  *     2048 732565503 732563456 349.3G af HFS / HFS+

Right, so the partition table matches what I saw last time.

After seeing that MacOS thinks the disk only holds about 349GB, I realized that the partition table really is horked; it's not a mere incompatibility with the bsd label business, which I'm not even sure is still a thing. It's been like 20 years since I partitioned (or labeled) a *bsd disk so I'm likely behind the times.

I wrote a couple of articles for Linux Journal, apparently in 2005. This one received a comment around that time of, "why didn't you just use gpart?" or maybe gparted. Uh, because of total ignorance? As Sheri says, "Dad knows the hard way to do everything."

Let's see if I can do this the easy way. I said sudo gpart /dev/sdg and went to lunch. An hour later, all that was on my screen was Begin scan... so I guess not. So we can't do this the easy way; we're gonna do it the hard way. Sheri might be right, that is if I'm successful.

Referring to the part of Apple's Technical Note TN1150 showing the volume header format and examining the bytes... from my earlier attempt... OK, one step at a time. I figured out at that time that the partition started 2048 blocks in, where each block is 4096 bytes. The volume header is 1024 bytes in from there. So with a 1024-byte block size, that is 2048 × 4 + 1 = 8193 1KB blocks in, and we should find the header there:

collin@p64:~$ sudo dd if=/dev/sdg bs=1024 skip=8193 count=1 status=none | hexdump -C
00000000  48 2b 00 04 80 00 21 00  48 46 53 4a 00 00 15 d7  |H+....!.HFSJ....|
00000010  d9 35 08 4d da 54 1b 92  00 00 00 00 d9 35 6a bd  |.5.M.T.......5j.|
00000020  00 30 c1 f8 00 0b 41 4c  00 00 20 00 15 d5 04 00  |.0....AL.. .....|
00000030  0a f1 a3 84 0a ef 00 00  00 01 00 00 00 01 00 00  |................|
00000040  00 3c 44 75 00 1e 9a f1  00 00 00 00 00 00 00 01  |.<Du............|
00000050  00 00 00 02 00 10 2d b8  00 00 00 00 00 00 00 00  |......-.........|
00000060  00 00 00 00 00 00 00 00  7d c5 0b 14 d5 a8 92 17  |........}.......|
00000070  00 00 00 00 02 ba c0 00  02 ba c0 00 00 00 15 d6  |................|
00000080  00 00 00 01 00 00 15 d6  00 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  00 00 00 00 01 00 00 00  01 00 00 00 00 00 08 00  |................|
000000d0  00 00 85 d8 00 00 08 00  00 00 00 00 00 00 00 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000110  00 00 00 00 69 a0 00 00  15 20 00 00 00 03 4d 00  |....i.... ....M.|
00000120  00 07 d0 d8 00 03 4d 00  00 00 00 00 00 00 00 00  |......M.........|
00000130  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000160  00 00 00 00 a9 00 00 00  15 20 00 00 00 05 48 00  |......... ....H.|
00000170  00 00 8d d8 00 05 48 00  00 00 00 00 00 00 00 00  |......H.........|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400

Let's interpret this.

signature is "H+"
version is 4
attributes: 0x80002100; no idea what that means
lastMountedVersion: "HFSJ"
journalInfoBlock: 0x15d7; no idea about that either
(offset 0x10)

createDate: This and the following dates are in the format described here; I translate them into localtime using this one-liner:

h2d() { python -c "import time; hs='0x$1$2$3$4'; print time.ctime(int(hs, base=0) - 2082844800)"; }

whence

collin@p64:~$ h2d d9 35 08 4d
        Sun Jun 23 03:43:25 2019

modifyDate

collin@p64:~$ h2d  da 54 1b 92
        Sun Jan 26 20:46:10 2020

backupDate: all zeroes.

checkedDate

collin@p64:~$ h2d d9 35 6a bd
        Sun Jun 23 10:43:25 2019
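
The h2d one-liner above is Python 2. A Python 3 sketch of the same conversion follows; the name hfs_date is made up here, and it reports UTC rather than local time so the result does not depend on the machine's timezone.

```python
from datetime import datetime, timezone

# HFS+ dates count seconds from 1904-01-01; Unix time counts from 1970-01-01.
HFS_EPOCH_OFFSET = 2082844800


def hfs_date(*hex_bytes):
    """Convert four big-endian hex bytes (as printed by hexdump) to a UTC datetime."""
    raw = int("".join(hex_bytes), 16)
    return datetime.fromtimestamp(raw - HFS_EPOCH_OFFSET, tz=timezone.utc)


print("createDate ", hfs_date("d9", "35", "08", "4d"))
print("modifyDate ", hfs_date("da", "54", "1b", "92"))
print("checkedDate", hfs_date("d9", "35", "6a", "bd"))
```

TN1150 appears to say that createDate is stored in local time while the other dates are GMT, which would account for createDate and checkedDate being exactly seven hours apart.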

(offset 0x20)

fileCount: 0x30c1f8 or 3195384. Three million files.
folderCount: 0xb414c or 737612. Average of what, 4 files per directory?
blockSize 0x2000, that is 8K
totalBlocks 0x15d50400, or 366281728.

Okay, I think that's all I care about. That last was very interesting; it says the partition ought to be 366281728 blocks (block = 8K; by 'K' I mean 1024). If we started at 1024 (8KB-sized) blocks from the start of the disk, and the end of the partition is 366281728 blocks later, that's, umm, 366282752 8K blocks. I mean that ...2752 number is the first 8K-block after the end of the partition.

So let me see, how many 1K blocks would that be? 2930262016 1K-blocks. If we want to look at the last-but-one, then this command ought to give me the trailing volume header:

collin@p64:~$ sudo dd if=/dev/sdg bs=1024 skip=2930262015 count=1 status=none | hexdump -C
00000000  48 2b 00 04 80 00 20 00  48 46 53 4a 00 00 15 d7  |H+.... .HFSJ....|
00000010  d9 35 08 4d da 54 0b 51  00 00 00 00 d9 35 6a bd  |.5.M.T.Q.....5j.|
00000020  00 2c 98 1f 00 0a 81 3b  00 00 20 00 15 d5 04 00  |.,.....;.. .....|
00000030  0b 00 70 9f 0a dd b9 b7  00 01 00 00 00 01 00 00  |..p.............|
00000040  00 37 5a 59 00 1d 85 1d  00 00 00 00 00 00 00 01  |.7ZY............|
… you get the idea …

Glad that worked. This also agrees with the 5860524030 number (of 512-byte blocks) used in the second dd command on my earlier post (i.e., it's half that). Whew!
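
The arithmetic above can be double-checked with a few lines of Python. This sketch unpacks the first 48 bytes of the volume header (copied from the first hexdump in this post) in the field order interpreted above, then recomputes where the alternate volume header should be. The 2048 × 4096 starting offset is the figure from the earlier post, not something the code discovers.

```python
import struct

# First 48 bytes of the primary volume header, from the hexdump above.
header = bytes.fromhex(
    "48 2b 00 04 80 00 21 00 48 46 53 4a 00 00 15 d7"
    "d9 35 08 4d da 54 1b 92 00 00 00 00 d9 35 6a bd"
    "00 30 c1 f8 00 0b 41 4c 00 00 20 00 15 d5 04 00"
)

# All fields are big-endian: signature, version, attributes,
# lastMountedVersion, journalInfoBlock, four dates, then
# fileCount, folderCount, blockSize, totalBlocks.
(signature, version, attributes, last_mounted, journal_info_block,
 create_date, modify_date, backup_date, checked_date,
 file_count, folder_count, block_size, total_blocks) = struct.unpack(
    ">2sHI4sI4I4I", header
)

partition_start = 2048 * 4096                       # bytes into the disk
volume_end = partition_start + total_blocks * block_size
alt_header_1k_block = volume_end // 1024 - 1        # header sits 1024 bytes before the end

print(signature, version, last_mounted)
print(file_count, folder_count, block_size, total_blocks)
print("end of volume, in 1K blocks:    ", volume_end // 1024)
print("alternate header, 1K block skip:", alt_header_1k_block)
print("end of volume, in 512B sectors: ", volume_end // 512)
```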

OK, so let me save off the first 4K of the drive, just in case I totally hork this. Also let me make sure I understand what I think it says.

collin@p64:~$ sudo dd if=/dev/sdg of=sheri-backup-hdd-4K.data bs=4k count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.236593 s, 17.3 kB/s
collin@p64:~$ hexdump -C sheri-backup-hdd-4K.data
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  1a ed a3 ce 00 00 80 fe  |................|
000001c0  ff ff af fe ff ff 00 08  00 00 00 08 aa 2b 00 00  |.............+..|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000

A web search led me to https://wiki.osdev.org/Partition_Table, where I found that this disk has just one partition defined, starting at byte 0x1be. The numbers are all little-endian.

bootable yes
starting head 0xfe (what)
starting sector 0x3f (high bits are for starting cylinder)
starting cylinder 0x3ff
system ID 0xaf, which this page says is "MacOS X HFS." Cool.
ending head 0xfe
ending sector 0x3f
ending cylinder 0x3ff
relative sector to start of ptn: 0x800
total sectors in partition 0x2baa0800 = 732563456

Now that last number, 0x2baa0800 or 732563456, matches the number of "sectors" that fdisk reported for this partition. So fdisk(1) was almost right.
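
Here is the same decoding as a Python sketch, using the 16 bytes at 0x1be from the hexdump above. The last two lines test an idea the post itself does not establish, so treat it as an assumption: if the start and length in the entry are read as counts of 4096-byte sectors instead of 512-byte ones, they line up exactly with the HFS+ volume described by the volume header (366281728 blocks of 8192 bytes, starting 2048 × 4096 bytes in).

```python
import struct

# The single partition entry, bytes 0x1be..0x1cd of the saved sector.
entry = bytes.fromhex("80 fe ff ff af fe ff ff 00 08 00 00 00 08 aa 2b")

# Little-endian: status, CHS start (3 bytes), type, CHS end (3 bytes),
# starting LBA, number of sectors.
status, chs_start, ptype, chs_end, lba_start, num_sectors = struct.unpack(
    "<B3sB3sII", entry
)


def decode_chs(b):
    """Split the 3-byte CHS field into (cylinder, head, sector)."""
    head = b[0]
    sector = b[1] & 0x3F                      # low six bits
    cylinder = ((b[1] & 0xC0) << 2) | b[2]    # top two bits are cylinder bits 8-9
    return cylinder, head, sector


print("bootable:", status == 0x80)
print("type: 0x%02x" % ptype)
print("CHS start:", [hex(v) for v in decode_chs(chs_start)])
print("CHS end:  ", [hex(v) for v in decode_chs(chs_end)])
print("start LBA:", lba_start, " sectors:", num_sectors)

# Assumption being tested: the table was written in 4096-byte sector units.
print(num_sectors * 4096 == 366281728 * 8192)
print(lba_start * 4096 == 2048 * 4096)
```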

Meanwhile, that wiki.osdev.org page notes that since CHS fields are useless on almost all current drives, they are set as above (an invalid setting). That was interesting but not, as they say, actionable. What am I supposed to do to fix this? Maybe testdisk? It currently says this:

TestDisk 6.14, Data Recovery Utility, July 2013
Christophe GRENIER 
http://www.cgsecurity.org

Disk /dev/sdg - 3000 GB / 2794 GiB - CHS 364801 255 63
     Partition               Start        End    Size in sectors
* HFS                      1   5  5 364800 190 62 5860507648











Structure: Ok.  Use Up/Down Arrow keys to select partition.
Use Left/Right Arrow keys to CHANGE partition characteristics:
*=Primary bootable  P=Primary  L=Logical  E=Extended  D=Deleted
Keys A: add partition, L: load backup, T: change type,
     Enter: to continue
HFS+ blocksize=8192, 3000 GB / 2794 GiB

So I hit <Enter>, and it gave me the option to "Write", so I said yes and exited. It says I have to reboot for the change to take effect. Well, how about if I just unplug the disk?

Well, it looks like I can't just unplug it and plug it back in again. I have to wait some time... maybe about 3 minutes? Then the system will recognize the disk and re-read the partition table, etc. Like this

collin@p64:~$ udevadm monitor --udev
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing

UDEV  [540827.342967] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4 (usb)
UDEV  [540827.377682] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0 (usb)
UDEV  [540827.378275] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host8 (scsi)
UDEV  [540827.378709] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host8/scsi_host/host8 (scsi_host)
UDEV  [540828.293126] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host8/target8:0:0 (scsi)
UDEV  [540828.293685] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host8/target8:0:0/8:0:0:0 (scsi)
UDEV  [540828.294498] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host8/target8:0:0/8:0:0:0/scsi_disk/8:0:0:0 (scsi_disk)
UDEV  [540828.295261] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host8/target8:0:0/8:0:0:0/scsi_device/8:0:0:0 (scsi_device)
UDEV  [540828.295499] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host8/target8:0:0/8:0:0:0/scsi_generic/sg6 (scsi_generic)
UDEV  [540828.295630] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host8/target8:0:0/8:0:0:0/bsg/8:0:0:0 (bsg)
UDEV  [540828.296220] add      /devices/virtual/bdi/8:96 (bdi)
UDEV  [540828.674004] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host8/target8:0:0/8:0:0:0/block/sdg (block)
UDEV  [540828.792427] add      /devices/pci0000:00/0000:00:1a.7/usb7/7-4/7-4:1.0/host8/target8:0:0/8:0:0:0/block/sdg/sdg1 (block)
UDEV  [540829.220135] add      /module/hfsplus (module)
UDEV  [540829.230641] add      /module/nls_utf8 (module)

Cool. Next:

collin@p64:~$ sudo fdisk -l /dev/sdg

Disk /dev/sdg: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xcea3ed1a

Device     Boot Start        End    Sectors Size Id Type
/dev/sdg1  *    16384 4294983678 4294967295   2T af HFS / HFS+

collin@p64:~$
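
The 4294967295 in that listing is a round number in disguise. A short Python sketch of the likely reason, offered as an assumption rather than a diagnosis: a DOS partition entry stores the sector count in 32 bits, and the size testdisk wanted to write does not fit.

```python
MAX_MBR_SECTORS = 2**32 - 1          # largest value a 32-bit sector count can hold
wanted_sectors = 5860507648          # "Size in sectors" shown by testdisk above

print(MAX_MBR_SECTORS)
print("fits in the partition table:", wanted_sectors <= MAX_MBR_SECTORS)
print("largest partition with 512-byte sectors: %.2f TiB"
      % (MAX_MBR_SECTORS * 512 / 2**40))
```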

So let fortune favor the foolish

collin@p64:~$ sudo mount -o ro /dev/sdg1 /foo/sdb1
mount: wrong fs type, bad option, bad superblock on /dev/sdg1,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
collin@p64:~$

Well. dmesg told me this near the end:

[540827.160035] usb 7-4: new high-speed USB device number 6 using ehci-pci
[540827.293356] usb 7-4: New USB device found, idVendor=1f75, idProduct=0611
[540827.293359] usb 7-4: New USB device strings: Mfr=4, Product=5, SerialNumber=6
[540827.293361] usb 7-4: SerialNumber: 20181129
[540827.293640] usb-storage 7-4:1.0: USB Mass Storage device detected
[540827.293953] scsi8 : usb-storage 7-4:1.0
[540828.292492] scsi scan: INQUIRY result too short (5), using 36
[540828.292499] scsi 8:0:0:0: Direct-Access     ST3000DM 001-9YN166            PQ: 0 ANSI: 0
[540828.292806] sd 8:0:0:0: Attached scsi generic sg6 type 0
[540828.293492] sd 8:0:0:0: [sdg] Very big device. Trying to use READ CAPACITY(16).
[540828.293868] sd 8:0:0:0: [sdg] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
[540828.294986] sd 8:0:0:0: [sdg] Write Protect is off
[540828.294989] sd 8:0:0:0: [sdg] Mode Sense: 3b 00 00 00
[540828.295985] sd 8:0:0:0: [sdg] No Caching mode page found
[540828.295989] sd 8:0:0:0: [sdg] Assuming drive cache: write through
[540828.297608] sd 8:0:0:0: [sdg] Very big device. Trying to use READ CAPACITY(16).
[540828.538767]  sdg: sdg1
[540828.539863] sd 8:0:0:0: [sdg] Very big device. Trying to use READ CAPACITY(16).
[540828.542869] sd 8:0:0:0: [sdg] Attached SCSI disk
[540829.251371] hfsplus: invalid secondary volume header
[540829.251377] hfsplus: unable to find HFS+ superblock
[541163.757363] hfsplus: invalid secondary volume header
[541163.757366] hfsplus: unable to find HFS+ superblock

Well, that is of course disappointing. Let's see whether MacOS can read it.

YES!

So I plugged the drive into the MacBook, and it was mounted immediately. MacOS apparently is a bit more tolerant of the placement of the backup volume header table, or rather of the advertised partition size. Probably it read the "first" Volume Header and believed the size it found there. And the placement of the backup volume header was consistent with that. So MacOS is happily copying files over to a new drive.

What if macOS hadn't been able to read the drive? Well, just before trying, I noticed that the size of the partition reported by testdisk—i.e., 5860507648 sectors—didn't match the partition size I calculated, viz. 5860524032 512-byte blocks. So if MacOS had still balked, I would have taken the drive back to my Linux box and attempted to tweak the size in the partition table. Fortunately it didn't come to that. Whew!

Saturday, March 21, 2020

Data recovery: Seagate SRD0SD1

This external backup drive is at least seven years old. Some days ago, it just wouldn't power itself on. Could I fix it?

My first hope, soon dashed, was that it might be the power supply. It would have been perhaps the easiest fix. But I determined that 12 volts were being supplied, and that the voltage didn't drop when the supply was plugged securely into the drive.

Next came some web searches. "Research" would not really be a fair way to characterize this, but maybe it would pass for research these days :-). I found "Danny"’s totally excellent teardown video, which enabled me to get the drive out of the enclosure.

I have a desk~~top~~side computer with SATA drives. I connected the Seagate 3TB drive in place of one of the existing drives, and tried, as superuser, "mount /dev/sdb /foo/sdb"

No joy. More web searching and after "apt-get install hfsprogs" I got this at the end of dmesg:

[67539.205183] hfsplus: unable to find HFS+ superblock
[67547.971602] hfs: can't find a HFS filesystem on dev sdb1

Well, let's have a look at the drive

collin@p64:~$ sudo fdisk -l /dev/sdb

Disk /dev/sdb: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0xcea3ed1a

Device     Boot Start       End   Sectors   Size Id Type
/dev/sdb1  *     2048 732565503 732563456 349.3G af HFS / HFS+

collin@p64:~$

So that's totally bogus; this is a 3TB drive! Then I remembered some work I did with netbsd around 2001; *BSD systems apparently use “labels” rather than the partition table. So the ptn table is just so much random bits 8^(

I decided to troll around a little, and found this. Don't ask me how I came up with 2048 blocks (with a totally unwarranted 4K blocksize) but here's what struck me:

collin@p64:~$ sudo dd status=none if=/dev/sdb bs=4k skip=2048 count=1 | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400  48 2b 00 04 80 00 21 00  48 46 53 4a 00 00 15 d7  |H+....!.HFSJ....|
00000410  d9 35 08 4d da 54 1b 92  00 00 00 00 d9 35 6a bd  |.5.M.T.......5j.|
00000420  00 30 c1 f8 00 0b 41 4c  00 00 20 00 15 d5 04 00  |.0....AL.. .....|
00000430  0a f1 a3 84 0a ef 00 00  00 01 00 00 00 01 00 00  |................|
00000440  00 3c 44 75 00 1e 9a f1  00 00 00 00 00 00 00 01  |.<Du............|
00000450  00 00 00 02 00 10 2d b8  00 00 00 00 00 00 00 00  |......-.........|
00000460  00 00 00 00 00 00 00 00  7d c5 0b 14 d5 a8 92 17  |........}.......|
00000470  00 00 00 00 02 ba c0 00  02 ba c0 00 00 00 15 d6  |................|
00000480  00 00 00 01 00 00 15 d6  00 00 00 00 00 00 00 00  |................|
00000490  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000004c0  00 00 00 00 01 00 00 00  01 00 00 00 00 00 08 00  |................|
000004d0  00 00 85 d8 00 00 08 00  00 00 00 00 00 00 00 00  |................|
000004e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000510  00 00 00 00 69 a0 00 00  15 20 00 00 00 03 4d 00  |....i.... ....M.|
00000520  00 07 d0 d8 00 03 4d 00  00 00 00 00 00 00 00 00  |......M.........|
00000530  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000560  00 00 00 00 a9 00 00 00  15 20 00 00 00 05 48 00  |......... ....H.|
00000570  00 00 8d d8 00 05 48 00  00 00 00 00 00 00 00 00  |......H.........|
00000580  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000
collin@p64:~$

Okay, now that’s more like it. "H+" and "HFSJ" appear there. Searching around the web, I came upon the "Volume Header" section in Apple Technical Note TN1150, which begins:

Each HFS Plus volume contains a volume header 1024 bytes from the start of the volume. The volume header -- analogous to the master directory block (MDB) for HFS -- contains information about the volume as a whole, including the location of other key structures in the volume. The implementation is responsible for ensuring that this structure is updated before the volume is unmounted.
A copy of the volume header, the alternate volume header, is stored starting 1024 bytes before the end of the volume. The implementation should only update this copy when the length or location of one of the special files changes. The alternate volume header is intended for use solely by disk repair utilities.

Okay, so note how that "H+" begins at 0x400 (i.e., 1024) bytes from the start of this block? The vol begins 2048 blocks in, with a 4K block size, and the volume header sits 1024 bytes in from there, just as TN1150 says. What else?
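To see what else that header says, here is a minimal sketch that decodes its first 48 bytes, assuming the big-endian field layout that TN1150 gives for the HFS Plus volume header (signature, version, attributes, lastMountedVersion, journalInfoBlock, four dates, fileCount, folderCount, blockSize, totalBlocks). The bytes are copied from offset 0x400 of the hexdump above; the Python field names are just illustrative labels.

```python
import struct

# First 48 bytes of the volume header, copied from offset 0x400 of the hexdump.
header = bytes.fromhex(
    "482b0004 80002100 4846534a 000015d7"
    "d935084d da541b92 00000000 d9356abd"
    "0030c1f8 000b414c 00002000 15d50400"
)

# Field order assumed from TN1150; these names are labels chosen for this sketch.
FIELDS = (
    "signature", "version", "attributes", "last_mounted_version",
    "journal_info_block", "create_date", "modify_date", "backup_date",
    "checked_date", "file_count", "folder_count", "block_size", "total_blocks",
)

# ">" = big-endian, no padding: 2-byte string, u16, u32, 4-byte string, 9 x u32.
vol = dict(zip(FIELDS, struct.unpack(">2sHI4s9I", header)))

# Size of the volume according to its own header.
volume_bytes = vol["block_size"] * vol["total_blocks"]

print(vol["signature"], vol["version"], vol["last_mounted_version"])
print("block size:", vol["block_size"], "total blocks:", vol["total_blocks"])
print("volume size in bytes:", volume_bytes)
```

If that layout is right, the header describes 366281728 allocation blocks of 8192 bytes each, or about 3.0 TB, which fits the drive far better than the 349.3G that fdisk reported.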

Somewhere on the web I read the instructions to get testdisk, as in sudo apt-get install testdisk. I said sudo testdisk /dev/sdb and after some flailing I got this:

TestDisk 6.14, Data Recovery Utility, July 2013
Christophe GRENIER <grenier@cgsecurity.org>
http://www.cgsecurity.org

Disk /dev/sdb - 3000 GB / 2794 GiB - CHS 364801 255 63
     Partition               Start        End    Size in sectors
>P HFS                        16384 5860524031 5860507648

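Okay, so if a sector is 512 bytes, then testdisk agrees with me that the vol starts 8MB in (MB=1024*1024 bytes). That 5860524031 number is the last sector of the volume, so the backup/alternate vol header, 1024 bytes before the end of the volume, should start one sector earlier. Let's see if we can find it by subtracting 1: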

collin@p64:~$ sudo dd status=none if=/dev/sdb bs=512 skip=5860524030 count=1 | hexdump -C
00000000  48 2b 00 04 80 00 20 00  48 46 53 4a 00 00 15 d7  |H+.... .HFSJ....|
00000010  d9 35 08 4d da 54 0b 51  00 00 00 00 d9 35 6a bd  |.5.M.T.Q.....5j.|
00000020  00 2c 98 1f 00 0a 81 3b  00 00 20 00 15 d5 04 00  |.,.....;.. .....|
00000030  0b 00 70 9f 0a dd b9 b7  00 01 00 00 00 01 00 00  |..p.............|
00000040  00 37 5a 59 00 1d 85 1d  00 00 00 00 00 00 00 01  |.7ZY............|
00000050  00 00 00 02 00 10 2d b8  00 00 00 00 00 00 00 00  |......-.........|
00000060  00 00 00 00 00 00 00 00  7d c5 0b 14 d5 a8 92 17  |........}.......|
00000070  00 00 00 00 02 ba c0 00  02 ba c0 00 00 00 15 d6  |................|
00000080  00 00 00 01 00 00 15 d6  00 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  00 00 00 00 01 00 00 00  01 00 00 00 00 00 08 00  |................|
000000d0  00 00 85 d8 00 00 08 00  00 00 00 00 00 00 00 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000110  00 00 00 00 69 a0 00 00  15 20 00 00 00 03 4d 00  |....i.... ....M.|
00000120  00 07 d0 d8 00 03 4d 00  00 00 00 00 00 00 00 00  |......M.........|
00000130  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000160  00 00 00 00 a9 00 00 00  15 20 00 00 00 05 48 00  |......... ....H.|
00000170  00 00 8d d8 00 05 48 00  00 00 00 00 00 00 00 00  |......H.........|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200
collin@p64:~$
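The sector arithmetic behind those two dd commands can be checked with a few lines of Python. All the numbers come from the fdisk and testdisk output above; the variable names are made up for this sketch, and the last comparison is only an observation, not a diagnosis.

```python
SECTOR = 512  # logical sector size reported by fdisk

# testdisk's view of the partition, in 512-byte sectors.
testdisk_start, testdisk_end, testdisk_size = 16384, 5860524031, 5860507648

# The first dd skipped 2048 blocks of 4096 bytes; that is the same byte offset.
guess_start_bytes = 2048 * 4096
same_start = guess_start_bytes == testdisk_start * SECTOR

# testdisk's End is the last sector of the volume, so the volume ends at the
# start of sector End+1; the alternate header starts 1024 bytes before that.
volume_end_bytes = (testdisk_end + 1) * SECTOR
alt_header_sector = (volume_end_bytes - 1024) // SECTOR

# The "bogus" table entry from fdisk, rescaled from 4096-byte to 512-byte units.
fdisk_start, fdisk_sectors = 2048, 732563456
scaled_start = fdisk_start * 4096 // SECTOR
scaled_size = fdisk_sectors * 4096 // SECTOR

print(same_start, alt_header_sector, scaled_start, scaled_size)
```

The scaled values come out equal to testdisk's start and size, which is at least consistent with the table having been written in 4096-byte units (for example by an enclosure that presented 4K sectors). That is an inference from the numbers, not something the output proves.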

Okay, now that's good to know. One other piece of info here. I saw on https://superuser.com/questions/657655/problems-with-mounting-hfs-drives that there is a program gdisk, somewhat like fdisk, which can give somewhat more detail on the partition types. And that there are (at least) two partition types of interest:

While the answer provided by mcy should work if the partition is actually an HFS+ partition, starting with OSX Yosemite the default partition type for a Mac is "Core Storage", which is used to handle logical volumes. This means that what you actually want to mount is a logical volume (using HFS+ filesytem) inside the "Core Storage" partition.
To see if your partition is of type "Apple Core Storage" you can use gdisk: AF05 is the code for "Apple Core Storage", while af00 is the code for "Apple HFS/HFS+".

What is this gdisk of which you speak? type gdisk said "not found" so I said:

collin@p64:~$ sudo apt-get install gdisk
Reading package lists... Done
Building dependency tree       
Reading state information... Done
gdisk is already the newest version.
gdisk set to manually installed.
The following packages were automatically installed and are no longer required:
  python-cffi python-colorama python-cryptography python-distlib
  python-ndg-httpsclient python-openssl python-ply python-pyasn1
  python-pycparser python-requests python-urllib3 python-wheel
Use 'apt-get autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 455 not upgraded.
collin@p64:~$ type gdisk
bash: type: gdisk: not found
collin@p64:~$

Hurmpf. Was it...? Yes it was. Here:

collin@p64:~$ sudo /sbin/gdisk -l /dev/sdb
GPT fdisk (gdisk) version 0.8.10
...
***************************************************************
Found invalid GPT and valid MBR; converting MBR to GPT format
in memory. 
***************************************************************

Disk /dev/sdb: 5860533168 sectors, 2.7 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): C599270C-2980-42BA-996C-7BF534EE6702
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 5860533134
Partitions will be aligned on 2048-sector boundaries
Total free space is 5127969645 sectors (2.4 TiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048       732565503   349.3 GiB   AF00  Apple HFS/HFS+
collin@p64:~$

Well, it still seems silly about the 349.3GBytes, but it says the partition code is the HFSplus one, not the core storage one. That's good news at least.

But not good enough I think. Note that even gdisk thinks we ought to be starting at sector 2048 and that the partition is only about 350GB in size. I noted "offset" and "sizelimit" options in https://superuser.com/questions/961401/mounting-hfs-partition-on-arch-linux/1088110, but mount(8) tells me that those are losetup options only:

       The mount command automatically creates a loop device  from  a  regular
       file  if  a filesystem type is not specified or the filesystem is known
       for libblkid, for example:

              mount /tmp/disk.img /mnt

              mount -t ext3 /tmp/disk.img /mnt

       This type of mount knows about three options, namely loop,  offset  and
       sizelimit,  that  are really options to losetup(8).  (These options can
       be used in addition to those specific to the filesystem type.)
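For what it's worth, here is an untested sketch of how those offset and sizelimit values could be derived from testdisk's numbers and handed to losetup(8) directly, leaving the partition table alone. It only builds and prints the commands. It assumes that losetup will attach a block device at a byte offset, that the hfsplus driver accepts the resulting loop device, and that a read-only attach is wanted; /dev/loopN is a placeholder for whatever device name losetup prints.

```python
SECTOR = 512
DEVICE = "/dev/sdb"
MOUNT_POINT = "/foo/sdb"
LOOP_PLACEHOLDER = "/dev/loopN"  # placeholder: use the name losetup prints

# Start and size of the volume as reported by testdisk, in 512-byte sectors.
start_sector, size_sectors = 16384, 5860507648

offset = start_sector * SECTOR      # byte offset of the volume on the disk
sizelimit = size_sectors * SECTOR   # length of the volume in bytes

# Attach read-only so that a wrong guess cannot damage the data.
losetup_cmd = [
    "losetup", "--find", "--show", "--read-only",
    "--offset", str(offset), "--sizelimit", str(sizelimit), DEVICE,
]
mount_cmd = ["mount", "-t", "hfsplus", "-o", "ro", LOOP_PLACEHOLDER, MOUNT_POINT]

print(" ".join(losetup_cmd))
print(" ".join(mount_cmd))
```

Both commands would need to be run as root, and neither has been tried on this drive.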

I don't want to hack the partition table, as I might totally b0rk it; another idea involves getting an adapter, sata↔USB, and just trying to mount it on a mac. It would have to be one with a power supply. Maybe something from newegg.com.

Monday, September 09, 2019

LG 34UM58-P Revisited: a Puzzle

I wrote earlier about this LG ultrawide monitor: 2560x1080 pixels, about 34" diagonal. Although it's great having all the pixels, the screen didn't look all that good on my Linux machine.

I created the image file you see at left. The squares in the upper-left corner have the red pixel turned on at x=0, 5, 10, 15, etc., and at y=0, 5, 10, 15, etc. I wanted to have just one of the color-dots on in each pixel, and I thought a black background would show more clearly what was going on.

I ran the display(1) program (from ImageMagick) and pulled out my camera. The picture below shows the sad story.

As you can see, the horizontal lines look more or less like lines, but the vertical lines are a mess. Before I examined the screen carefully, I had assumed the card didn't have enough horizontal pixels; maybe it had just 1920 per line, and it was dithering or something to decide what signal to send. But then I thought we ought to get a clean vertical line at some point.

So this is a puzzle. When I hooked this monitor up to the lovely Carol's macbook, the display looked great, not muddy at all. I wondered if it was the cable; I bought a new one with better specs. Maybe it looks a little better, but the photo was shot while using the new cable. Maybe I should be using the card's DVI outputs rather than the mini-HDMI output? I have no idea.

Sunday, December 17, 2017

Life imitates code

Sometimes life imitates art, as they say; this is a case where life imitates code.

It started a half-century ago at least (the story, not the code). Way back when I was in elementary school, Dad gave me the idea of making “Soma cube”s out of wood. I don’t know how many of these I made back in grade school, but a few years ago I started doing it again.

Then about this time last year, I wondered what it would be like to write a program to solve the puzzle. This page has links to the source code for a solver written in C++; I couldn’t run it because I don’t do Windows. I soon decided to keep the data representation but otherwise write a new solver in Python.

The solver places the “T” piece in the only way possible (as explained here in the wikipedia article), then places the “L” piece in one of 28 unique positions. Here “unique” means never having to say “that’s a reflection of a previously-described solution”; ask me or see the code if you’re hungry for details. Anyway, the other pieces (V, Z, A, B, P in wikipedia’s nomenclature) get placed after T and L.

Consulting a recent output, I discovered that the “V” has 103 possible placements (by “possible” I mean given the position of “T”), whereas the “P” has only 42. For some reason I had the intuition that placing the “P” first would use less CPU time.

I ended up coding a sort step so that the solver would place the “hardest” piece (the “P”) first, and then the pieces with increasing numbers of possible placements, and finally the “V” last. How much of a difference does it make? I don’t recall, but I’ll just try it now on a 2.4GHz Intel Q6600:

$ time ./soma2-unsorted.py > s2-unsorted.out

real 3m46.587s
user 3m46.564s
sys 0m0.004s
$ time ./soma2.py > s2-sorted.out

real 0m38.732s
user 0m38.720s
sys 0m0.008s
$

That’s 226.6 seconds vs. 38.7 seconds, nearly six times faster: a nontrivial difference.
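The idea of the sort step can be sketched in a few lines. This is not the code from soma2.py: the piece names, the one-dimensional toy board, and the functions order_pieces and solve are all invented for illustration. Each piece maps to its list of possible placements (sets of occupied cells), and the piece with the fewest placements goes first.

```python
def order_pieces(placements):
    """Return piece names sorted so the piece with fewest placements comes first."""
    return sorted(placements, key=lambda name: len(placements[name]))


def solve(order, placements, occupied=frozenset()):
    """Yield every way to place all pieces in `order` without overlapping cells."""
    if not order:
        yield []
        return
    first, rest = order[0], order[1:]
    for cells in placements[first]:
        if occupied.isdisjoint(cells):  # this placement fits in the free cells
            for tail in solve(rest, placements, occupied | cells):
                yield [(first, cells)] + tail


# Toy puzzle: fill a 1x4 strip (cells 0..3) with a 3-cell bar and a 1-cell dot.
placements = {
    "dot": [frozenset({i}) for i in range(4)],              # 4 placements
    "bar": [frozenset({0, 1, 2}), frozenset({1, 2, 3})],    # 2 placements
}

order = order_pieces(placements)
solutions = list(solve(order, placements))
print(order, len(solutions))
```

Trying the constrained piece first means fewer branches near the root of the search tree, so dead ends are discovered sooner; the set of solutions found is the same whichever order is used.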

Fast-forward to earlier this month. I was thinking again about wooden puzzles and wikipedia told me about the “diabolical cube”, with only 13 solutions, and it’s here that life imitates code.

Because of my experience with the Soma cube solver, I immediately started by placing the “hard”est-to-place pieces first, and then trying to place the easier ones. By proceeding that way, I figured out the first several solutions. But to get the rest of them, I modified the Soma cube solver… after a few tries it worked, completing in about a second. Here’s an actual run on the same box as above:

$ time ./diacube.py  > d.out

real 0m0.978s
user 0m0.976s
sys 0m0.000s
$

The 7-cube piece must be placed parallel to a face of the desired 3x3x3 cube. It can be either:

(a) actually on a face, in which case the rest of the pieces determine a unique solution; or
(b) between two faces, in which case it’s possible to create solutions that are reflections of each other. I could have avoided this by coding for placement of the second piece (I would have done this to the 6-cube piece) but instead opted to write a little code to keep only one solution out of a pair of reflections.

There wouldn’t be many of these, because my program produced 17 solutions, whereas Wikipedia told me there were only 13. So it computed only 4 redundant solutions, and it would take longer to think about unique placements than it would to write the code to find reflections.

Or so I thought. I goofed up the code that looked for reflections, because I had forgotten that my solutions contained references to each polycube’s placement. In computing a reflection, I trashed the original polycube. A rookie error.
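That kind of aliasing bug is easy to reproduce. The sketch below is not the code from diacube.py; it assumes, purely for illustration, that a solution is a dict mapping a piece name to a list of [x, y, z] cells, and both function names are hypothetical.

```python
def reflect_in_place(solution):
    """BUGGY: rewrites cell lists that other solutions may also reference."""
    for cells in solution.values():
        for cell in cells:
            cell[0] = 2 - cell[0]  # mirror x within a 3-wide cube, in place
    return solution


def reflect_copy(solution):
    """Build a mirrored solution from new lists, leaving the input untouched."""
    return {name: [[2 - x, y, z] for x, y, z in cells]
            for name, cells in solution.items()}


placement = [[0, 0, 0], [1, 0, 0]]   # one placement of a 2-cube piece
solution_a = {"two": placement}
solution_b = {"two": placement}      # refers to the very same list object

mirror = reflect_copy(solution_a)
untouched_after_copy = solution_a["two"] == [[0, 0, 0], [1, 0, 0]]

reflect_in_place(solution_b)         # silently changes solution_a as well
print(untouched_after_copy, solution_a["two"])
```

After the in-place call, solution_a no longer describes the placement it started with, even though only solution_b was passed in.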

Anyway, if you ever get your hands on a “Diabolical cube,” it’ll be a lot easier to solve if you place the 7-cube piece and the 6-cube piece and the 5-cube piece first.

For reference, I’ll give you the 13 solutions, after some blank space if you want to avoid the spoiler :)

solution 1:

6	6	6
4	4	2
4	4	2

7	6	6
7	7	7
7	7	7

3	3	6
5	3	5
5	5	5

solution 2:

6	6	6
4	4	2
4	4	2

7	6	6
7	7	7
7	7	7

5	5	6
5	3	3
5	5	3

solution 3:

6	6	6
2	4	4
2	4	4

7	6	6
7	7	7
7	7	7

3	3	6
5	3	5
5	5	5

solution 4:

6	6	6
2	4	4
2	4	4

7	6	6
7	7	7
7	7	7

5	5	6
5	3	3
5	5	3

solution 5:

6	5	5
6	6	3
6	6	6

4	4	5
4	4	3
2	2	3

7	5	5
7	7	7
7	7	7

solution 6:

6	5	5
6	6	3
6	6	6

2	2	5
4	4	3
4	4	3

7	5	5
7	7	7
7	7	7

solution 7:

4	6	2
4	6	2
5	6	5

4	6	3
4	6	3
5	5	5

7	6	3
7	7	7
7	7	7

solution 8:

4	6	3
4	6	3
5	6	5

4	6	2
4	6	3
5	5	5

7	6	2
7	7	7
7	7	7

solution 9:

4	5	5
4	3	2
3	3	2

4	5	6
4	6	6
6	6	6

7	5	5
7	7	7
7	7	7

solution 10:

4	5	5
4	3	3
2	2	3

4	5	6
4	6	6
6	6	6

7	5	5
7	7	7
7	7	7

solution 11:

3	5	5
4	4	2
4	4	2

3	5	6
3	6	6
6	6	6

7	5	5
7	7	7
7	7	7

solution 12:

2	5	5
3	4	4
3	4	4

2	5	6
3	6	6
6	6	6

7	5	5
7	7	7
7	7	7

solution 13:

3	5	5
2	4	4
2	4	4

3	5	6
3	6	6
6	6	6

7	5	5
7	7	7
7	7	7
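A listing like this can be sanity-checked mechanically. The sketch below assumes that each digit in a grid is the number of cubes in the piece occupying that cell (which is how the post's "7-cube piece" and "6-cube piece" appear to map onto the 7s and 6s), and it checks only solution 1, copied from above with spaces in place of tabs.

```python
from collections import Counter

# Solution 1 from the listing: three 3x3 layers separated by blank lines.
solution_1 = """
6 6 6
4 4 2
4 4 2

7 6 6
7 7 7
7 7 7

3 3 6
5 3 5
5 5 5
"""

# Split into layers, then rows, then individual labels.
layers = [[row.split() for row in block.splitlines()]
          for block in solution_1.strip().split("\n\n")]

# Count how many cells each label occupies across the whole cube.
counts = Counter(int(label) for layer in layers for row in layer for label in row)

# Under the assumption above, piece k should occupy exactly k cells, 27 in all.
consistent = (all(counts[size] == size for size in range(2, 8))
              and sum(counts.values()) == 27)
print(dict(counts), consistent)
```

This checks only the cell counts; it does not verify that each piece's cells form the right shape.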