geek

convmv solves rsync encoding problem

I was using rsync to copy files from a big external backup drive to an even bigger external backup drive I just got. (It's a 500 gig Seagate FreeAgent Go, which requires only the USB connection for power.) I got a bunch of these errors:

rsync: recv_generator: failed to stat 
"/path/Rough-Guide-to-Tango/15-N\#351stor-Marconi-Tr\#355o---Cuando-T\#372-No-Est\#341s.mp3": 
Invalid or incomplete multibyte or wide character (84)

I installed convmv ("converts filenames from one encoding to another") and ran convmv -f latin1 -t utf-8 *, which by default only shows what it would rename. I ran it again with --notest to actually rename the files, and then rsync worked without any trouble. Nice!
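Before converting anything, it can help to see which names are the problem. Here's a minimal sketch, assuming iconv is installed. The function name non_utf8_names is made up for this example and is not part of convmv or rsync. It reads names on standard input and prints the ones that aren't valid UTF-8, which in a case like the one above are the Latin-1 names that need converting.

```shell
# Hypothetical helper (not part of convmv or rsync): read file names on
# stdin, one per line, and print those that are not valid UTF-8.
non_utf8_names() {
    while IFS= read -r name; do
        # iconv exits non-zero when its input is not valid in the -f encoding.
        if ! printf '%s' "$name" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            printf '%s\n' "$name"
        fi
    done
}

# Example: check the names in the current directory.
# ls | non_utf8_names
```

Names with embedded newlines would confuse this line-by-line approach, so treat it as a quick check rather than a replacement for convmv's own dry run.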

Four ways of using Amazon S3

I've been thinking about how I might change how I do backups. I set up an account at Amazon Web Services so I could use S3, their online storage system. Since I already have an Amazon account it was trivial to get started.

I'm doing all this on my new Ubuntu machine. Along the way I had to install a bunch of new libraries and packages, and I didn't keep track. I'll mention what I remember. Sorry about that. Oh, and in the examples below I sometimes add some comments with a #. For my examples I use test-file.mp3 and file.jpg, two copies of the same file, which is why they have identical byte sizes.

s3cmd

s3cmd is part of s3tools. I installed it from the s3cmd package and then ran s3cmd --configure to set it up. I skipped encryption, and because it failed its test when I told it to use HTTPS, I turned HTTPS off. It made a .s3cfg file and then I was all set:
$ s3cmd mb s3://wdenton.example
# Two Python warnings skipped in all these quotes
Bucket 'wdenton.example' created
$ s3cmd put test-file.mp3 s3://wdenton.example/
File 'test-file.mp3' stored as s3://wdenton.example/test-file.mp3 (1613952 bytes in 23.6 seconds, 66.84 kB/s) [1 of 1]
$ s3cmd ls
2009-06-28 01:06  s3://wdenton.example
$ s3cmd ls s3://wdenton.example/
Bucket 's3://wdenton.example':
2009-06-28 01:10   1613952   s3://wdenton.example/test-file.mp3
# You can make folders, except they're fake
$ s3cmd put test-file.mp3 s3://wdenton.example/dir/test-file.mp3
File 'test-file.mp3' stored as s3://wdenton.example/dir/test-file.mp3 (1613952 bytes in 19.0 seconds, 82.98 kB/s) [1 of 1]
$ s3cmd ls s3://wdenton.example/
Bucket 's3://wdenton.example':
2009-06-28 02:31   1613952   s3://wdenton.example/dir/test-file.mp3
2009-06-28 01:10   1613952   s3://wdenton.example/test-file.mp3
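
The "fake folders" comment deserves a sketch. S3 stores a flat set of keys; a "folder" is only a shared prefix ending in /, and the directory view is built at listing time. What follows is a minimal local illustration of that idea, not how s3cmd is implemented. list_prefix is a made-up name, and it works on a plain list of keys rather than talking to S3.

```shell
# Hypothetical helper: given object keys on stdin (one per line), print
# the entries that sit directly under the given prefix, the way a
# folder-style listing would. Deeper keys collapse to "name/".
list_prefix() {
    awk -v p="$1" '
        substr($0, 1, length(p)) == p {
            rest = substr($0, length(p) + 1)
            slash = index(rest, "/")
            if (slash > 0) rest = substr(rest, 1, slash)
            if (rest != "" && !(rest in seen)) { seen[rest] = 1; print rest }
        }'
}

# The two keys stored above, listed at the top level and under dir/.
printf '%s\n' dir/test-file.mp3 test-file.mp3 | list_prefix ""
printf '%s\n' dir/test-file.mp3 test-file.mp3 | list_prefix "dir/"
```

The first call prints dir/ and test-file.mp3, the second prints only test-file.mp3. This may be part of why tools can disagree about directories: each one has to decide for itself what counts as one.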

s3fs

The s3fs project has a page called FuseOverAmazon that explains how to mount an S3 bucket as a file system. (FUSE stands for Filesystem in Userspace.) To get this working I think I installed the libfuse2, libfuse-dev, and fuse-utils packages (or at least they're installed now and everything works), and g++ so I had a C++ compiler. You have to compile s3fs by hand, but that just means running make and moving the binary somewhere useful.

There are differences between how s3fs and other things handle "directories" in S3 buckets. In another test where I used s3cmd sync I couldn't see any of the files with s3fs, but in this example I seem to have made it work, sort of.

$ s3fs wdenton.example -o accessKeyId=my_access_key -o secretAccessKey=my_secret_key mnt
$ cd mnt
$ ls -l
# Notice that it doesn't see the dir "directory" I made above
total 1577
---------- 1 root root 1613952 2009-06-27 21:10 test-file.mp3
$ cp ~/file.jpg .
$ ls -l
total 3153
-rw-r--r-- 1 buff buff 1613952 2009-06-27 22:50 file.jpg
---------- 1 root root 1613952 2009-06-27 21:10 test-file.mp3
$ mkdir dir
$ mv file.jpg dir/
$ ls -l
total 1577
drwxr-xr-x 1 buff buff       0 2009-06-27 22:51 dir
---------- 1 root root 1613952 2009-06-27 21:10 test-file.mp3
$ cd dir
$ ls -l
# But after I made dir here, it saw the file I'd put there earlier
total 3153
-rw-r--r-- 1 buff buff 1613952 2009-06-27 22:50 file.jpg
---------- 1 root root 1613952 2009-06-27 22:31 test-file.mp3
$ cd 
$ fusermount -u mnt/

How to s3fs on EC2 Ubuntu has some helpful information.

Duplicity

duplicity gives "encrypted bandwidth-efficient backup using the rsync algorithm." I used rsync and rdiff-backup to synchronize files, and this is the same kind of thing. I installed it from the package.

This environment variable stuff isn't documented in the man page, but it's explained here: Duplicity + Amazon S3 = incremental encrypted remote backup.

$ mkdir test
$ mv file.jpg test-file.mp3 test/
$ export AWS_ACCESS_KEY_ID=my_access_key
$ export AWS_SECRET_ACCESS_KEY=my_secret_key
$ duplicity test/ s3+http://wdenton.example/test/
GnuPG passphrase: [I enter a phrase]
No signatures found, switching to full backup.
Retype passphrase to confirm: [I reenter the phrase]
--------------[ Backup Statistics ]--------------
StartTime 1246158681.12 (Sat Jun 27 23:11:21 2009)
EndTime 1246158681.53 (Sat Jun 27 23:11:21 2009)
ElapsedTime 0.41 (0.41 seconds)
SourceFiles 0
SourceFileSize 3232000 (3.08 MB)
NewFiles 0
NewFileSize 0 (0 bytes)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 0
RawDeltaSize 3227904 (3.08 MB)
TotalDestinationSizeChange 2956111 (2.82 MB)
Errors 0
-------------------------------------------------
# Let's see what's in the test/ directory up there
$ s3cmd ls s3://wdenton.example/test/
Bucket 's3://wdenton.example':
2009-06-28 03:12     76249   s3://wdenton.example/test/duplicity-full-signatures.2009-06-27T23:11:12-04:00.sigtar.gpg
2009-06-28 03:12       189   s3://wdenton.example/test/duplicity-full.2009-06-27T23:11:12-04:00.manifest.gpg
2009-06-28 03:11   2955922   s3://wdenton.example/test/duplicity-full.2009-06-27T23:11:12-04:00.vol1.difftar.gpg
# Now let's do a test restore
$ cd /tmp
$ duplicity s3+http://wdenton.example/test/ test/
GnuPG passphrase: [I enter the phrase]
$ cd test/
$ ll
total 3168
-rw-r--r-- 1 buff buff 1613952 2009-06-27 22:28 file.jpg
-rw-r--r-- 1 buff buff 1613952 2009-06-27 21:10 test-file.mp3
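
Since duplicity reads the keys from the environment, a small guard can catch a forgotten export before the backup starts. This is only a sketch; require_aws_env is a hypothetical name, not something duplicity provides.

```shell
# Hypothetical guard: succeed only if both AWS variables are set and non-empty.
require_aws_env() {
    for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
        # Indirect lookup of the variable whose name is in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $var" >&2
            return 1
        fi
    done
}

# Example use with the backup command from above:
# require_aws_env && duplicity test/ s3+http://wdenton.example/test/
```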

S3Fox Organizer

S3Fox is the simplest way to get into your S3 space: it's a Firefox extension and lets you drag and drop files. Here's a screenshot:

Screenshot of S3Fox (small version)

Something about the dir/ directory I made with s3fs doesn't work here, but the test/ directory I made with duplicity does. s3fs doesn't see the test/ directory. It seems that if you go with s3fs you pretty much have to stick with it, but the other tools work nicely together.

Conclusion

If I do start using this for backups, I'll use Duplicity and do manual stuff with s3cmd and S3Fox.

How I use Emacs for Getting Things Done

I use the Getting Things Done system to keep track of what I'm doing. It works very well for me. I keep track of my personal stuff on paper in a Filofax, but I have a lot more detail to track at work at York University, so I use text files. Here's my system.

The files

I have three files to manage what I'm doing, plus a monthly work diary:

  • next-actions.outline.txt
  • waiting-for.outline.txt
  • projects.outline.txt
  • work-diary-200904.outline.txt, in which I jot down notes about what I did that day and what's on my mind.

(I'll explain about Emacs and outline mode below.)

git to manage the files

I use the distributed version control system git to manage these files. First, I set up a basic repository on a Unix host where I do my personal e-mail.

$ mkdir -p york/gtd
$ cd york/gtd
$ git init
Initialized empty Git repository in /home/buff/york/gtd/.git/

Here I edited next-actions.outline.txt. More on its format below. Right now it only matters that the file exists.

$ git status
# On branch master
#
# Initial commit
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       next-actions.outline.txt
nothing added to commit but untracked files present (use "git add" to
track)

$ git add next-actions.outline.txt
$ git commit -m 'getting started'
[master (root-commit)]: created e98f811: "getting started"
 1 files changed, 4 insertions(+), 0 deletions(-)
 create mode 100644 next-actions.outline.txt
$ git log
commit e98f811b6b67ffd354ff33ef5df3da872a8e7059  
Author: William Denton 
Date:   Tue Apr 21 21:12:44 2009 -0400

    getting started

Now I have a git repository with one file sitting on a Unix server I can get to from anywhere: work, home, anywhere with an Internet connection. I make a copy of it on my home machine:

$ cd york
$ git clone picketfence:york/gtd/
Initialized empty Git repository in /home/buff/york/gtd/.git/
remote: Counting objects: 3, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (3/3), 268 bytes, done.

I already had a york directory where I kept stuff. I was able to specify the server and file path with just picketfence:york/gtd because I already have ssh set up to save me time in ~/.ssh/config:

Host picketfence
Hostname picketfence.server.com
User buff

This lets me just say ssh picketfence and I connect. I've got my keys set up so no password is required, either.

Back to cloning a local copy of the repository.

$ cd gtd
$ ls -l 
total 2
-rw-r--r--  1 buff  wheel  44 21 Apr 21:13 next-actions.outline.txt

Here I can edit next-actions.outline.txt again and add a task. When I'm done, I do this:

$ git commit -m 'update' next-actions.outline.txt
[master]: created 15f2969: "update"
 1 files changed, 1 insertions(+), 0 deletions(-)
$ git status
# On branch master
# Your branch is ahead of 'origin/master' by 1 commit.
#
nothing to commit (working directory clean)
$ git push
Counting objects: 5, done.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 326 bytes, done. 
Total 3 (delta 0), reused 0 (delta 0)
warning: updating the currently checked out branch; this may cause
confusion,
as the index and working tree do not reflect changes that are now in HEAD.
To picketfence:york/gtd
   e98f811..15f2969  master -> master
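
The warning in that push output appears because the repository on the server has a checked-out working tree, and newer versions of git refuse such a push by default. A common way around it is to make the central copy a bare repository. Here's a minimal sketch, run entirely in a temporary directory so it touches nothing real; the paths and the identity are placeholders, and this is not what was done above.

```shell
# Sketch: a bare "central" repository plus a working clone, all under a
# throwaway directory. Paths and identity below are placeholders.
tmp=$(mktemp -d)
git init --quiet --bare "$tmp/gtd.git"      # bare: no working tree to get out of step
git clone --quiet "$tmp/gtd.git" "$tmp/gtd" 2>/dev/null   # hides the "empty repository" warning

echo '* E-mail' > "$tmp/gtd/next-actions.outline.txt"
git -C "$tmp/gtd" add next-actions.outline.txt
git -C "$tmp/gtd" -c user.name='Example User' -c user.email='user@example.com' \
    commit --quiet -m 'getting started'
git -C "$tmp/gtd" push --quiet origin HEAD  # push the current branch under its own name
```

On a real server the equivalent would be something like git init --bare york/gtd.git, cloned as picketfence:york/gtd.git. Those paths are guesses based on the ones used above.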

The next day at work I did the same thing and made a local copy of the repository there. I made my lists and so on, and at the end of the day I committed all of the files. At home in the evening, I ran git pull and it downloaded all of the changes. I could edit them, do a git push, and then the next day do another git pull first thing at work.

Now I have an easy way of keeping my GTD files in sync across various machines. Each machine has a full local copy, rather than the files living only in the cloud, so I can work on them without Internet access.
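
The commit, pull, and push routine could be wrapped in one command. Here's a rough sketch, assuming the clone already tracks a branch on the remote. gtd_sync is a hypothetical name, and unlike the commands above it stages every changed file in the directory instead of naming files one by one.

```shell
# Hypothetical helper: commit any local edits in a clone, pick up what the
# other machines pushed, then push. Assumes an upstream branch is configured.
gtd_sync() {
    dir=$1
    git -C "$dir" add -A || return 1
    # "diff --cached --quiet" exits 0 when nothing is staged, so commit only otherwise.
    if ! git -C "$dir" diff --cached --quiet; then
        git -C "$dir" commit --quiet -m "update $(date +%Y-%m-%d)" || return 1
    fi
    git -C "$dir" pull --quiet --rebase || return 1
    git -C "$dir" push --quiet
}

# Example: gtd_sync ~/york/gtd
```

Running it first thing in the morning and last thing in the evening would cover the pull-then-push pattern described above.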

Emacs and outline mode

I'm a Unix-loving geek, so of course I keep my text in text files. To manage the GTD files I settled on outline mode, which is built into Emacs.

That page explains what outline mode is. Here's an example of mine:

* E-mail
** Catherine: accurate collection stats for Wikipedia entry
** LCC: will be away for next meeting
** Peter R: is Joomla in use anywhere at York?
If so, could I get a test account to see what it's like?

To prevent having to run M-x outline-mode every time I open one of these files, I added this to .emacs:

;; outline-mode
(add-to-list 'auto-mode-alist '("\\.outline\\.txt\\'" . outline-mode))
(add-hook 'outline-mode-hook 'hide-body)

Now when I open next-actions.outline.txt Emacs automatically goes into outline mode and hides the bodies of all the entries so I just see the tasks.

It works for me

This system works really well for me. When I'm jotting down notes on what I did that day, if I remember I need to do something (e-mail someone, read something, fix something, whatever) I can switch buffers to the next-actions list and put it down there. If I have a new project on the go, I copy notes to the projects list. If I have a next action of e-mailing someone, when I've e-mailed them I just copy the line to the waiting-for list and add the date I sent the e-mail.

Plain text, Emacs to edit it, outline mode to give it some structure, and git so I always have a current copy of the files I need. I'm really happy with this.
