Coding Emacs's M-x in Lisp

Sep. 8th, 2009 | 02:24 am

I have always wondered what criteria determine whether an Emacs internal routine is written in Emacs Lisp or exists as a built-in in Emacs's C source code. I have grown accustomed to how most core Emacs commands and functions, no matter how small, are written in Emacs Lisp. Still, there is no doubt a lot of Emacs that has to be written in C.

I recently wanted to change how `M-x' works. The command behind it is called `execute-extended-command', and it is written in C. This is disappointing for my desire to tinker, but not altogether surprising either. It is a pretty central piece of the Emacs infrastructure.

To extend the `M-x' command, I didn't want to jump into writing C code. It would have been a chance to practice compiling Emacs, and potentially debugging my extension to the command, but I wanted to avoid that development cycle, so I tried translating the C code into Emacs Lisp. Given how cleanly written the Emacs sources are, it was not a very difficult task. For extra cuteness, I even carried over the C comments.

Use at your own risk. I would be interested in feedback on how this works for people, but redefining a central command like this could potentially create a lot of problems for your Emacs. With that out of the way, I have used this for a few weeks without a problem.

The block of C code for defining Fexecute_extended_command hasn't changed significantly over the last ten years, so the Emacs Lisp version I present below should work as a replacement with the last two major releases of Emacs -- versions 22 and 23.

;; Based on Fexecute_extended_command in keyboard.c of Emacs.
;; Aaron S. Hawley <aaron.s.hawley(at)gmail.com> 2009-08-24
(defun execute-extended-command (prefixarg)
  "Read function name, then read its arguments and call it.

To pass a numeric argument to the command you are invoking, specify
the numeric argument to this command.

Noninteractively, the argument PREFIXARG is the prefix argument to
give to the command you invoke, if it asks for an argument."
  (interactive "P")
  ;; The call to completing-read will start and cancel the hourglass,
  ;; but if the hourglass was already scheduled, this means that no
  ;; hourglass will be shown for the actual M-x command itself.
  ;; So we restart it if it is already scheduled.  Note that checking
  ;; hourglass_shown_p is not enough,  normally the hourglass is not shown,
  ;; just scheduled to be shown.
  (let* ((hstarted (and (symbolp window-system)
                        (eq void-text-area-pointer 'hourglass)))
         (saved-keys (this-command-keys-vector)) ;; ?\M-x
         (buf (concat 
               (cond ((eq prefixarg '-)
                      "- ")
                     ((and (consp prefixarg)
                           (= (car prefixarg) 4))
                      "C-u ")
                     ((and (consp prefixarg)
                           (integerp (car prefixarg)))
                      (format "%d " (car prefixarg)))
                     ((integerp prefixarg)
                      (format "%d " prefixarg))
                     (t ""))
               "M-x "))
         (function (completing-read buf obarray 'commandp t nil
                                    'extended-command-history)))
    (if hstarted (setq void-text-area-pointer 'hourglass))
    (if (and (stringp function)
             (= (length function) 0))
        (error "No command name given")
      ;; Set this_command_keys to the concatenation of saved-keys and
      ;; function, followed by a RET.
      (setq saved-keys (vconcat saved-keys
                                function
                                [return]))
      (setq function (intern function)))
    (setq prefix-arg prefixarg)
    (setq this-command function)
    (command-execute function 'record saved-keys)
    ;; If enabled, show which key runs this command.
    (if (and (not (null suggest-key-bindings))
             (null executing-kbd-macro))
        ;; If the command has a key binding, print it now.
        (let ((bindings (where-is-internal function
                                           overriding-local-map t))
              (waited))
          ;; But first wait, and skip the message if there is input.
          (if (and (not (null bindings))
                   (not (and (vectorp bindings)
                             (eq (aref bindings 0) 'mouse-movement))))
              ;; If this command displayed something in the echo area;
              ;; wait a few seconds, then display our suggestion message.
              (if (null (current-message))
                  (setq waited (sit-for 0))
                (if (numberp suggest-key-bindings)
                    (setq waited (sit-for suggest-key-bindings))
                  (setq waited (sit-for 2)))))
          ;; Show the suggestion only if the wait finished without input.
          (if (and waited
                   (not (consp unread-command-events)))
              (with-temp-message (current-message)
                (let ((binding (key-description bindings)))
                  (message "You can run the command `%s' with %s"
                           function binding)
                  (if (numberp suggest-key-bindings)
                      (setq waited (sit-for suggest-key-bindings))
                    (setq waited (sit-for 2))))))))))

The result is half as many lines of code as the C. But the real benefit is being able to mess around with it. If this version, which mimics the standard behavior, proves to be stable, I may see whether a few customizations to these new bits hold up as well.

I currently put the above code in a file called m-x.el, then put the file in my load-path and add the following to my .emacs file.

  (load "m-x")

Link | Leave a comment {9} | Add to Memories | Tell a Friend

Feed: Alex Schroeder

Aug. 23rd, 2009 | 02:46 pm

Alex Schroeder is the author of the Oddmuse Wiki engine and various Emacs modes, including SQL mode and ANSI color. He is also responsible for maintaining the Emacs Wiki for the emacsen community.

Alex maintains two other sites. Community Wiki discusses online culture, and the Campaign Wiki organizes materials on role playing games (RPG).

LiveJournal users can sign up for updates from Alex's blog by adding the kensanata_diary feed to their friends list.

Feed: Sacha Chua

Jul. 11th, 2009 | 02:42 am

Sacha Chua works for Global Business Services at IBM, and was previously an IBM Enterprise 2.0 consultant. She is also a speaker, a self-declared "technology evangelist", application developer, Web 2.0 expert and "geek".

She is a prolific writer who can post multiple times daily: extremely technical articles about Drupal or Emacs, some analysis on working and self-organization, simple reflections on hobbies and personal life, and sometimes short-story fiction.

LiveJournal users can sign up for updates from Chua's blog by adding the sachachuawiki feed to their friends list.

Feed: kfogel rants

Jul. 7th, 2009 | 10:25 am

Karl Fogel works for Canonical, the English and South African company behind Ubuntu. He is well-known for his contributions to CVS and is an author of Subversion. He wrote the CVS book and Producing Open Source Software: How to Run a Successful Free Software Project.

His blog posts include technical and political issues related to free software and copyright reform, but random musings also show up.

LiveJournal users can sign up for updates from Fogel's blog by adding the kfogelrants feed to their friends list.

Feed: SFLC Tech Blog

Jul. 6th, 2009 | 08:19 am

The Tech blog at the Software Freedom Law Center (SFLC) features Bradley M. Kuhn, former (and first ever) director of the Free Software Foundation. Kuhn is now the Policy Analyst and Technology Director for the SFLC.

His articles are about licensing and legal issues around free software, but he also reviews software and gives updates on the computing infrastructure of the SFLC, which he manages.

LiveJournal users can sign up for updates from Kuhn's blog by adding the sflc_tech feed to their friends list.

Unmasking passwords

Jul. 4th, 2009 | 11:32 pm

I have to say that I fully endorse showing passwords in the clear as a user interface rule. This is a big break from the tradition of masking passwords with dots or asterisks, or of not echoing anything at all, so it is obviously quite controversial for folks. However, clear passwords avoid the usability problems of typing passwords blind. They require people to actually make sure their neighbors are looking the other way when they are logging in or authenticating. They could also make people more likely to use long, complex passwords -- passphrases. Nobody should be looking in the direction of a keyboard while someone is typing a password; clear passwords would require this as etiquette and induce a social norm.

The other day, security expert bruce_schneier wrote a follow-up to his own article, titled The Pros and Cons of Password Masking. He was back-pedaling from his previous post, The Problem with Password Masking, which criticized the masking of passwords and concurred with the same opinion put forth on alertbox by Jakob Nielsen.

In the outrage that followed, there seemed to be a suggestion that this is bad, especially in the case where people are using a projector while logging in to a system. This is a weak argument. Projectors have off buttons, and some even have "black screen" buttons, so simply use them. False trust in password masking can also result in people typing their password in the clear during a presentation because the presenter was confused about where the cursor's focus was. I've seen this happen.

Iran is not a twitter revolution

Jun. 29th, 2009 | 05:07 pm

Reese Erlich is a freelance journalist and author who's been covering recent events in Iran -- *from Iran*, and not just from his computer chair as many in the mainstream media have.

He recently countered "left-wing Doubting Thomas arguments" in an article on CommonDreams.org. In his arguments, I found these observations about Iranians "fighting for political, social and economic justice" inspiring.

[...]

Assertion: The U.S. has a long history of meddling in Iran, so it must be behind the current unrest.

[...]

Frankly, based on my observations, no one was leading the demonstrations. During the course of the week after the elections, the mass movement evolved from one protesting vote fraud into one calling for much broader freedoms. You could see it in the changing composition of the marches. There were not only upper middle class kids in tight jeans and designer sun glasses. There were growing numbers of workers and women in very conservative chadors.

Iranian youth particularly resented President Ahmadinejad's support for religious militia attacks on unmarried young men and women walking together and against women not covering enough hair with their hijab. Workers resented the 24 percent annual inflation that robbed them of real wage increases. Independent trade unionists were fighting for decent wages and for the right to organize.

Some demonstrators wanted a more moderate Islamic government. Others advocated a separation of mosque and state, and a return to parliamentary democracy they had before the 1953 coup. But virtually everyone believes that Iran has the right to develop nuclear power, including enriching uranium. Iranians support the Palestinians in their fight against Israeli occupation, and they want to see the U.S. get out of Iraq.

So if the [sic] CIA was manipulating the demonstrators, it was doing a piss poor job.

[...]

See also Erlich's Iran is not a twitter revolution.

How the culture is hostile to women

Jun. 26th, 2009 | 10:57 am

I've written before about how I believe that a lot of women aren't recruited or drawn into all levels of computing because the culture is male-centered and therefore not attractive to women. I also believe a symptom of this disease -- one that only makes the situation worse -- is the outright sexist and misogynistic acts by men. Often, such harassment is acted out in private or in electronic forums -- these incidents are well-documented. However, occasionally this hostility bubbles up and boils over into public and in-person situations.

This month, there was a presentation at a Flash conference that sounded completely absurd. Read about it at Prude or Professional? by Courtney Remes at the geekgirlsguide.

Earlier this year there was a similar presentation at a Ruby conference. See Why Rails is Still Ghetto by sarahmei and gender and sex at gogaruco by Sarah Allen for reports from two of the six women who attended the conference, a full third, both of whom found the presentation offensive.

On all this, I prefer to just quote some of what volsunga wrote in Porn. Ruby. *headdesk* rather than add more meta-discussion to these controversies. It is stated well.

[...] This is the classic "it's more offensive for you to say I'm a sexist than for me to actually be sexist!" response. People with an agenda (usually those sneaky feminists) choose to find something offensive so they can have a whine and call someone mean names, like "sexist". But what's at stake here isn't that the presentation was offensive per se, but that the context was inappropriate and potentially alienating to women developers, in an environment that's already default male by dint of numbers.

There's also the classic "you could just ignore it if you don't like it" defence. [...]

This presumes that people who don't like pictures of naked women went along just so they could complain. But even if everyone who thought they might not like the talk didn't go, it'll still be wrong to show it; the very presence of such a slideshow at the event creates an atmosphere where women are "them", where some content is made solely for men, but as if "male" is "default". [...]

And it doesn't matter if it was intentional -- no one really thinks [the presenter] sat down and schemed to offend women in advance -- and by refocusing on intention [the presenter] is able to get away with all that "poor little me" stuff in his post, as if his whole character has been impugned.

Newsflash: there's a difference between saying "you're a sexist/racist/homophobe" and "some of the stuff you just did/said contributed to the sexist/racist/homophobic culture around X".

Message to Ruby developers who think this is out of control/proportion/just a bit silly: all your rights to nod sympathetically/join in when someone bemoans the lack of women developers are entirely removed (for ever) if when women do speak up, you pull this self-pitying, I'm-a-nice-guy-really, its-not-my-fault, thats-just-the-way-I-roll, stop-complaining bullshit. And if those who complained then get painted as moralistic, shrill and angry for the sake of it.

There are various posts up and around about why this has become a blame game, and that it's counter-productive. It wouldn't be a blame game if there had been less bombastic denial and more listening on the part of the speaker in the first place. Blame games stop when someone puts their hands up and scrutinises their behaviour. So get on with it.

A fun resource I found while poking around in this is Derailing for Dummies.

Drivable Motor Vehicle Act

Jun. 24th, 2009 | 03:44 pm

I noticed an old comment posted to someone's site by Matthew Davidson, who wrote a car metaphor for free software. Car metaphors are a commonly used device, but I think this one is especially good. (I *swear* I didn't find this in a search for my own surname.)


I drive a little Toyota hatchback. I do so because I got it relatively cheap from my sister-in-law. This was possible because she owned the car, [instead of having] a non-transferable license to use the car under certain conditions.

I know how to pump up the tyres and refill the thing that squirts water on the windscreen. That's all I know about maintaining the vehicle, and probably all I ever will know. I take it to the local mechanic of my choice every couple of years and he fustigates the Smoot-Hawley flanges or whatever for me, at what I can only assume is a reasonable price.

I am very glad that the bonnet was not locked shut at the factory by Toyota, and that there is not a Drivable Motor Vehicle Act (DMVA) to make it a criminal offense for anybody to attempt to service their own car, or pay somebody other than the manufacturer to service it. I may not personally know the first thing about its [sic] inner workings, but if I suspect I'm being charged to much for some work on my car, I can go a few hundred metres up the road to the next mechanic who can provide me with a quote.

Most of these mechanics probably chose this trade after opening up the bonnet of their own car and having a playful poke around, the same way I learned how to program computers. Now as Richard Stallman would say, the ethical issues around car manufacturing and software manufacturing are not the same; I don't have the legal right to make a perfect copy of my car, but that's okay because I don't have the practical means to do so -- no matter how much technical skill I am able to acquire, and neither does anybody but very large corporations, so losing that freedom (through patents) doesn't cost me anything, while potentially delivering the benefits to society that the patent system [for car manufacturing] is supposed to provide.

But if somebody paid me to write some software for them and I said "okay, I'll write it for you, but only under the condition that you don't copy it or attempt to fix or improve it yourself, or pay somebody else to fix or improve it," that would be a very bad deal for the customer, because the means to do these things are so cheap that you are practically only paying for the time of the person who does the work (or not, if you do it yourself). It would be such a bad deal in fact, that if I managed to convince a sucker to fall for it, I would have to regard my own behaviour as unethical.

Granted there aren't as many programmers as motor vehicle mechanics in my town, but that can and should change. Already I can point to half a dozen people I know who could (and hopefully will) become as familiar with the inner workings of [the free software package] Drupal as myself with only a little effort. As this begins to happen across a wide range of software the real cost of proprietary software (as opposed to the mere price tag), and the benefits of freedom, will become apparent to even the most non-technical users.


Speaking of "tyres", "metres" and "bonnets": the End Software Patents campaign needs help documenting the patent issue in Australia and New Zealand, among other locales.

Release of dump package

Jun. 18th, 2009 | 01:49 pm

The dump/restore package version 0.4b42 is being released today. It is the first release in over 3 years. Read the release notes.

It supports versions two through four of the Linux kernel's extended file system (ext2, ext3, ext4). Technically, it is considered beta software, but I have reason to believe there are a lot of people who use it in production systems. You can also use it for disk-based backups. You don't need to have a tape jukebox to use it, although it is designed for use with tape backups.

Sorting UTF-8 strings in PHP

May. 28th, 2009 | 03:23 pm

With Unicode text, in this case the popular UTF-8 encoding, sometimes you need to convert characters to ASCII to get things done in PHP. For sorting Unicode, there are the existing solutions of collator_sort() for PHP5 and strcoll() since PHP4. However, they both assume a locale. A hack that is locale-agnostic would just "normalize" Unicode characters to ASCII.

This is far from complete, but seems to do the right thing.

    <?php

    /**
     * Normalize international characters for purposes like sorting and
     * searching by using a heuristic that just uses ASCII--the english
     * alphabet ordering--for a multilingual solution--no locale setting.
     */
    header("Content-type: text/plain; charset=utf-8");

    /**
     * Iñtërnâtiônàlizætiøn
     *
     * Example from Sam Ruby
     * http://intertwingly.net/stories/2004/04/14/i18n.html
     * 
     * By way of WACT team
     * http://www.phpwact.org/php/i18n/charsets
     */
    $internationalization = array(
				  "I", // I
                                  "\xC3\xB1", // ñ
                                  "t", // t
                                  "\xC3\xAB", // ë
                                  "r", // r
                                  "n", // n
                                  "\xC3\xA2", // â
                                  "t", // t
                                  "i", // i
                                  "\xC3\xB4", // ô
                                  "n", // n
                                  "\xC3\xA0", // à
                                  "l", // l
                                  "i", // i
                                  "z", // z
                                  "\xC3\xA6", // æ
                                  "t", // t
                                  "i", // i
                                  "\xC3\xB8", // ø
                                  "n"); // n
    
    /** 
     * Use strtr() with this dictionary to convert to ASCII.
     * This data structure is not comprehensive.
     */
    $utf8_dict = array("\xC3\x80" => "A", // À
                       "\xC3\x81" => "A", // Á
                       "\xC3\x82" => "A", // Â
                       "\xC3\x83" => "A", // Ã
                       "\xC3\x84" => "A", // Ä
                       "\xC3\x85" => "A", // Å
                       "\xC3\x86" => "A", // Æ
                       "\xC3\x9E" => "B", // Þ
                       "\xC3\x87" => "C", // Ç
                       "\xC4\x86" => "C", // Ć
                       "\xC4\x8C" => "C", // Č
                       "\xC4\x90" => "Dj", // Đ
                       "\xC3\x88" => "E", // È
                       "\xC3\x89" => "E", // É
                       "\xC3\x8A" => "E", // Ê
                       "\xC3\x8B" => "E", // Ë
                       "\xC4\x9E" => "G", // Ğ
                       "\xC3\x8C" => "I", // Ì
                       "\xC3\x8D" => "I", // Í
                       "\xC3\x8E" => "I", // Î
                       "\xC3\x8F" => "I", // Ï
                       "\xC4\xB0" => "I", // İ
                       "\xC3\x91" => "N", // Ñ
                       "\xC3\x92" => "O", // Ò
                       "\xC3\x93" => "O", // Ó
                       "\xC3\x94" => "O", // Ô
                       "\xC3\x95" => "O", // Õ
                       "\xC3\x96" => "O", // Ö
                       "\xC3\x98" => "O", // Ø
                       "\xC3\x9F" => "Ss", // ß
                       "\xC3\x99" => "U", // Ù
                       "\xC3\x9A" => "U", // Ú
                       "\xC3\x9B" => "U", // Û
                       "\xC3\x9C" => "U", // Ü
                       "\xC3\x9D" => "Y", // Ý
                       "\xC3\xA0" => "a", // à
                       "\xC3\xA1" => "a", // á
                       "\xC3\xA2" => "a", // â
                       "\xC3\xA3" => "a", // ã
                       "\xC3\xA4" => "a", // ä
                       "\xC3\xA5" => "a", // å
                       "\xC3\xA6" => "a", // æ
                       "\xC3\xBE" => "b", // þ
                       "\xC3\xA7" => "c", // ç
                       "\xC4\x87" => "c", // ć
                       "\xC4\x8D" => "c", // č
                       "\xC4\x91" => "dj", // đ
                       "\xC3\xA8" => "e", // è
                       "\xC3\xA9" => "e", // é
                       "\xC3\xAA" => "e", // ê
                       "\xC3\xAB" => "e", // ë
                       "\xC3\xAC" => "i", // ì
                       "\xC3\xAD" => "i", // í
                       "\xC3\xAE" => "i", // î
                       "\xC3\xAF" => "i", // ï
                       "\xC3\xB0" => "o", // ð
                       "\xC3\xB1" => "n", // ñ
                       "\xC3\xB2" => "o", // ò
                       "\xC3\xB3" => "o", // ó
                       "\xC3\xB4" => "o", // ô
                       "\xC3\xB5" => "o", // õ
                       "\xC3\xB6" => "o", // ö
                       "\xC3\xB8" => "o", // ø
                       "\xC5\x94" => "R", // Ŕ
                       "\xC5\x95" => "r", // ŕ
                       "\xC5\xA0" => "S", // Š
                       "\xC5\x9E" => "S", // Ş
                       "\xC5\xA1" => "s", // š
                       "\xC3\xB9" => "u", // ù
                       "\xC3\xBA" => "u", // ú
                       "\xC3\xBB" => "u", // û
                       "\xC3\xBC" => "u", // ü
                       "\xC3\xBD" => "y", // ý
                       "\xC3\xBD" => "y", // ý
                       "\xC3\xBF" => "y", // ÿ
                       "\xC5\xBD" => "Z", // Ž
                       "\xC5\xBE" => "z"); // ž
    
    $i18n = join("", $internationalization);
    print $i18n . "\n";

    /**
     * UTF-8 regular expression from
     * http://php.net/manual/en/function.utf8-decode.php (comment 57069)
     */
    $utf8_re = "/^([\\x00-\\x7f]|"
      . "[\\xc2-\\xdf][\\x80-\\xbf]|"
      . "\\xe0[\\xa0-\\xbf][\\x80-\\xbf]|"
      . "[\\xe1-\\xec][\\x80-\\xbf]{2}|"
      . "\\xed[\\x80-\\x9f][\\x80-\\xbf]|"
      . "\\xef[\\x80-\\xbf][\\x80-\\xbc]|"
      . "\\xee[\\x80-\\xbf]{2}|"
      . "\\xf0[\\x90-\\xbf][\\x80-\\xbf]{2}|"
      . "[\\xf1-\\xf3][\\x80-\\xbf]{3}|"
      . "\\xf4[\\x80-\\x8f][\\x80-\\xbf]{2})*$/";

    print "Valid UTF-8?: " . (preg_match($utf8_re, $i18n) > 0
			      ? "true" : "false") . "\n";

    print strtr($i18n, $utf8_dict) . "\n";

    // Doesn't work in PHP4?
    $sorted = preg_split("//u", $i18n, -1, PREG_SPLIT_NO_EMPTY);
    // So, just use the original array, instead.
    $sorted = $internationalization;

    function compare($s1, $s2)
    {
      global $utf8_dict;
      return strcasecmp(strtr($s1, $utf8_dict),
			strtr($s2, $utf8_dict));
    }

    usort($sorted, "compare");
    print join("", $sorted) . "\n";

    /**
     * Results:
     * 
     * Iñtërnâtiônàlizætiøn
     * Valid UTF-8?: true
     * Internationalization
     * àæâëIiiilñnnnøôrtttz
     */
    ?>

I tried the I18N_UnicodeNormalizer from the PHP PEAR project, and it didn't do what I wanted.

    <?php

    require_once('I18N/UnicodeNormalizer.php');

    print I18N_UnicodeNormalizer::toNFD($i18n) . "\n";
    print I18N_UnicodeNormalizer::toNFC($i18n) . "\n";
    ?>

There's a good chance I don't know what I'm doing there with the PEAR library, however.

Unicode hex in PHP string

May. 27th, 2009 | 08:32 pm

In Emacs, insert the UTF-8 encoding of the character at point as hex escapes for a PHP string.

(defun php-hex-for-char ()
  (interactive)
  (insert
   (mapconcat (lambda (x) (format "\\x%02X" x))
              (encode-coding-char (char-after (point)) 'utf-8)
              "")))

Lisp lifted from `describe-char' and `encoded-string-description'.
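Outside of Emacs, roughly the same conversion can be sketched in the shell with od and sed. This is only an illustration under assumptions: the function name php_hex is made up, and the argument is assumed to be UTF-8 already, since the function just dumps whatever bytes it is given.

```shell
#!/bin/sh
# php_hex (hypothetical name): print the bytes of a string as \xHH escapes.
# Assumes the argument is already UTF-8; the bytes are dumped as-is.
php_hex() {
  printf '%s' "$1" |
    od -An -v -tx1 |          # one hex pair per byte, no offsets
    tr 'abcdef' 'ABCDEF' |    # upper case, like %02X in the Lisp version
    sed 's/[[:space:]]*\([0-9A-F][0-9A-F]\)/\\x\1/g; s/[[:space:]]*$//' |
    tr -d '\n'                # join the lines od produced
  echo
}

# n with tilde, written as octal escapes so this script stays ASCII.
php_hex "$(printf '\303\261')"    # prints \xC3\xB1
```

Unlike the Emacs command, this works on a whole string rather than the single character at point.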

Shell hack: Files with some DOS lines

May. 19th, 2009 | 11:22 am

I came across a project whose source code contains both DOS text files and Unix text files. Some of the Unix files contain lines ending in carriage returns. Then again, perhaps they were DOS files with some Unix line endings! I wanted to suggest converting those files with mixed line endings to Unix.

Sometimes, the file command is helpful for showing what files have a mixed end of line style, but not always. For example, the file command will say "ASCII C program text, with CRLF, LF line terminators". That's perfect. However, sometimes the command just says, "PHP script text".

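I wrote this find expression to get files that contain DOS carriage returns but are not entirely DOS files. (The `^V^M' in the grep pattern stands for typing C-v C-m at the terminal to enter a literal carriage return.)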

$ find -type f -execdir grep -qe '^V^M$' {} \; \
       ! -execdir awk 'BEGIN{is_dos=1;}!/\r$/{is_dos=0}END{exit(!is_dos);}' {} \; \
       -print

The above doesn't work, since the last line of many DOS files has no line ending at all (neither carriage return nor newline), whereas Unix text files end in a newline.

Awk still counts that last line as a line, but since it has no carriage return the file is not considered a DOS file by the logic I've written. This results in a false negative.

This change to the Awk script makes this hack work as it should.

$ find -type f -execdir grep -qe '^V^M$' {} \; \
       ! -execdir awk 'BEGIN{is_dos=1;}
                       !/\r$/ && is_dos{is_dos=0;n=NR}
                       END{exit(!is_dos && n != NR);}' {} \; \
       -print
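A different way to attack the same problem, separate from the find expression above, is to count line endings in one Awk pass and print a style for each file. The sketch below assumes a POSIX shell and Awk; the function name eol_style and the sample file names are invented for illustration. It keeps the same heuristic as the corrected script: a single line without a carriage return is forgiven if it is the last line of the file.

```shell
#!/bin/sh
# eol_style (hypothetical name): print "unix", "dos", "mixed" or "empty".
eol_style() {
  awk '!/\r$/ { lf++; last_lf = NR }   # count lines lacking a trailing CR
       END {
         if (NR == 0)                        print "empty"
         else if (lf == NR)                  print "unix"
         else if (lf == 0)                   print "dos"
         else if (lf == 1 && last_lf == NR)  print "dos"   # unterminated last line
         else                                print "mixed"
       }' "$1"
}

# Build three throwaway samples and classify them.
tmp=$(mktemp -d) || exit 1
printf 'a\nb\n'        > "$tmp/unix.txt"
printf 'a\r\nb'        > "$tmp/dos.txt"     # DOS file, last line unterminated
printf 'a\r\nb\nc\r\n' > "$tmp/mixed.txt"
for f in unix dos mixed; do
  printf '%s: %s\n' "$f.txt" "$(eol_style "$tmp/$f.txt")"
done
rm -rf "$tmp"
```

As with the original hack, a DOS file whose final line ends in a bare newline is still reported as "dos"; Awk alone cannot tell that case apart from a missing line ending.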

Send Amazon's Bezos some peaches

May. 13th, 2009 | 04:32 pm

I just noticed there was a great action by the FSF's anti-DRM campaign last month against Amazon's Kindle electronic book reader. See Defective by Design: Impeach Bezos for Amazon's Kindle Swindle. The idea is to send baby food peaches to Jeff Bezos for having terms that allow Amazon to deny customers access to their electronic books.

Sending baby food is pretty easy in the age of the Internet; see the link and instructions at the end of the post.

Change log entries for HTML files

May. 5th, 2009 | 01:01 pm

Someone asked me if there was a good way to annotate the changes of an HTML file. It sounded like the person had to maintain some legacy, HTML-hell, home-brewed, template files for some business Web site.

I suggested using the ChangeLog support of Emacs, and using HTML comments to organize sections of an HTML source file. Here's a simple, made-up example of such an HTML file.

<html>
<head>
<title>Sample only</title>
</head>
<body>
<!-- begin header -->
<p>[ <a id="top" href="#bottom">bottom</a> ]</p>
<!-- end header -->
<h1>Sample title</h1>
<!-- BEGIN: PAGE_CONTENT -->
<div>
<p>Testing.</p>
</div>
<!-- END: PAGE_CONTENT --
  -- footer-bottom start -->
<p>[ <a id="bottom" href="#top">top</a> ]</p>
<!-- footer-bottom end -->
</body>
</html>

Unfortunately, support for the above sectioning style, or for any alternative, is not provided by the HTML mode that ships with Emacs. This is understandable because there is no consistent standard for doing this, and people use even more variations than those covered in the example. Not to mention, HTML comments are used for reasons other than naming regions of the file.

Regardless, I've put together the following regular expression for add-log-current-defun-header-regexp. It handles the cases in the example above. It is set for all buffers using HTML mode. Just put the following in your .emacs file.

    (add-hook 'html-mode-hook
        (lambda ()
          (make-local-variable
           'add-log-current-defun-header-regexp)
           (setq add-log-current-defun-header-regexp
               (concat "^[ \t]*<?!?--[ \t]*\\(?:begin\\|BEGIN\\|start\\)?"
                       "[ \t:]*\\([-_[:alnum:]]+\\)"
                       "[ \t]*\\(?:begin\\|BEGIN\\|start\\)?[ \t]*--"))))

Use it by typing `C-x 4 a' (add-change-log-entry-other-window). An entry like the following will be added in a nearby ChangeLog file:

2009-05-05  Aaron S. Hawley  <aaronhawley@livejournal.com>

        * file.html (PAGE_CONTENT): Add a test paragraph.
        (footer-bottom): Added link to "#top".

This setup will work for most cases except for scenarios where there is nested sectioning or where you've run `C-x 4 a' from a point outside of a "section" and get a false-positive.
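If you want to preview which section names the regular expression would pick up before trusting your ChangeLog to it, here's a rough shell approximation. It is only a sketch: `list_sections` is a name I made up, the pattern is a hand translation of the Emacs regexp into an extended regular expression for sed, and unlike Emacs it is always case-sensitive.

```shell
#!/bin/sh
# list_sections: print the section name of every line that looks like the
# opening comment of a section.  Hypothetical helper, approximating the
# Emacs regexp above with sed -E; reads the named files or standard input.
list_sections() {
    sed -nE 's/^[[:blank:]]*<?!?--[[:blank:]]*(begin|BEGIN|start)?[[:blank:]:]*([-_[:alnum:]]+)[[:blank:]]*(begin|BEGIN|start)?[[:blank:]]*--.*$/\2/p' "$@"
}
```

Run it as `list_sections file.html`. Closing comments such as "end header" are skipped because nothing in the pattern allows a second word before the closing `--`.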


Shell hack: Avoiding built-ins

May. 1st, 2009 | 04:40 am

To avoid using a builtin command of a Bourne or Bash shell in a shell script, one can use the full path of the executable command. For example, rather than

$ echo Hello, World\!
Hello, World!

you could

$ /bin/echo Hello, World\!
Hello, World!

Here's a way to show the difference--and make fun of the GNU coding standards at the same time.

$ echo --version
--version
$ /bin/echo --version
echo (GNU coreutils) 6.12
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Brian Fox and Chet Ramey.

I prefer using exec to spelling out the full path of a command. That way the PATH environment variable is still used, and nothing breaks should the full path to a binary change some day.

Unfortunately, exec replaces the current shell process with the command, so when the command exits, your shell script ends with it. To avoid that, just wrap the exec statement in a sub-shell by using parens:

$ ( exec echo --version )
echo (GNU coreutils) 6.12
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Brian Fox and Chet Ramey.

I have never seen this written in a script before. Perhaps there's another, more canonical way to do this. The construct reads as redundant and contradictory--"exec something in the current shell, but also in a sub-shell". Further, it's almost always right to opt for the shell built-in; there are few, if any, cases where you want to avoid it. My only use so far has been timing processes in the shell.

According to the Limitations of Shell Builtins section of the GNU Autoconf manual,

When it is desired to avoid a regular shell built-in, the workaround is to use some other forwarding command, such as env or nice, that will ensure a path search:

          $ pdksh -c 'exec true --version' | head -n1

          $ pdksh -c 'nice true --version' | head -n1
          true (GNU coreutils) 6.10
          $ pdksh -c 'env true --version' | head -n1
          true (GNU coreutils) 6.10
     

That manual has everything in it. I guess I'll go with env. It doesn't sound as nice as "exec", but it's a good mnemonic, since it uses the environment's PATH variable to run the command.

$ env echo --version
echo (GNU coreutils) 6.12
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Brian Fox and Chet Ramey.
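To see both workarounds inside an actual script, here's a minimal sketch. The function names `via_subshell` and `via_env` are made up for illustration, and it assumes an external echo exists somewhere in PATH. The point is only that the script keeps running after each call.

```shell
#!/bin/sh
# Two ways to force a PATH search for a command that is also a builtin.
# Both function names are hypothetical, not standard commands.

# Subshell + exec: the exec replaces only the throwaway subshell,
# so the calling script survives.
via_subshell() { ( exec "$@" ); }

# Forwarding command: env is an external program and looks the
# name up in PATH itself.
via_env() { env "$@"; }

via_subshell echo "external echo, via exec in a subshell"
via_env echo "external echo, via env"
echo "builtin echo: the script got this far"
```

One difference to keep in mind: env treats leading arguments of the form NAME=value as environment settings, so the two helpers are not interchangeable for every argument list.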


Shell hack: Date work

Apr. 26th, 2009 | 01:21 am

I needed to make some Apache redirects for some links on a Unix user's group Web site I maintain. The new site is based on a wiki, and a member of the group moved all the pages with meeting announcements by hand, using more readable page names. The old pages had the date as a four-digit year, a two-digit month, and a two-digit day (for example, 20061219). The new pages have the spelled-out weekday and month (for example, Tuesday, December 19, 2006).

Here's a sample of what I needed for the .htaccess file.

Redirect /group/meeting-20061219.html   http://host.org/group/wiki/index.php/Tuesday,_December_19,_2006
Redirect /group/meeting-20070417.html   http://host.org/group/wiki/index.php/Tuesday,_April_17,_2007
Redirect /group/meeting-20070515.html   http://host.org/group/wiki/index.php/Tuesday,_May_15,_2007
Redirect /group/meeting-20070717.html   http://host.org/group/wiki/index.php/Tuesday,_July_17,_2007
Redirect /group/meeting-20071128.html   http://host.org/group/wiki/index.php/Wednesday,_November_28,_2007
Redirect /group/meeting-20080618.html   http://host.org/group/wiki/index.php/Wednesday,_June_18,_2008

I could do this by hand, but I'd rather get a shell script to do it right the first time. I found it easy to do with an extended grep expression, a Perl substitution, awk, and the date command that comes with GNU coreutils.

$ ls -1 \
  | grep -Ee '[0-9]{8}\.html$' \
  | perl -pe 's/([0-9]{4})([0-9]{2})([0-9]{2})\.html$/$&\t$1-$2-$3/' \
  | awk '{printf "%s\t", $1;
          system("date +\"%A,_%B_%-d,_%Y\" -d "  $2);}' \
  | awk '{print "Redirect", "/group/" $1,
                "http://host.org/group/wiki/index.php/" $2;}'

I'm thankful I consistently used a file naming convention with the old site.
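For comparison, the same conversion can be sketched as a small shell function with one call to date per file. This is an illustration under assumptions, not what I ran: `redirect_line` is a made-up name, it expects file names shaped like meeting-YYYYMMDD.html, and it relies on GNU date for both `-d` and the `%-d` format, which prints the day of the month without padding.

```shell
#!/bin/sh
# redirect_line: print one Redirect directive for a file named like
# meeting-YYYYMMDD.html.  Hypothetical helper; assumes GNU date.
redirect_line() {
    file=$1
    ymd=${file%.html}      # strip the extension
    ymd=${ymd##*-}         # keep what follows the last hyphen: YYYYMMDD
    iso=$(printf '%s\n' "$ymd" | sed -E 's/^(.{4})(.{2})(.{2})$/\1-\2-\3/')
    # LC_ALL=C keeps the weekday and month names in English.
    title=$(LC_ALL=C date -d "$iso" +'%A,_%B_%-d,_%Y') || return 1
    printf 'Redirect /group/%s http://host.org/group/wiki/index.php/%s\n' \
        "$file" "$title"
}

# Emit a directive for every matching file in the current directory.
for f in meeting-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].html; do
    if [ -e "$f" ]; then
        redirect_line "$f"
    fi
done
```

Using `%-d` rather than `%e` matters for single-digit days: `%e` pads with a space, which would split the page title in two for any later step that breaks fields on whitespace.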


Database programming

Apr. 23rd, 2009 | 07:33 pm

This simple bit of PHP made some changes to a MediaWiki installation for me recently. Don't use it. My point here is to show the satisfaction of changing a database with code that generates the SQL statements for you. The golden rule for database applications is to make sure there's an interface or programming layer to the actual database to preserve data integrity and keep changes limited in scope. One should use some of the scripts that come with MediaWiki--batchMove.php or namespaceDupes.php for example--to do this. Such scripts are not only better trusted, they also handle the schema should it change in a later release of the software.

Now that the disclaimer is out of the way, I was pleasantly surprised by how consistent the MediaWiki schema is: it let me rely on simple data structures and a few for-loops to update 9 tables. Clearly, the schema isn't entirely normalized, but I suspect there is a rationale for keeping some of the data denormalized. Given this scenario, it is a real compliment to a software package and its schema if one can write very concise code to generate a series of SQL statements for a task.

<?php
// Move 3 pages, and start 2 namespaces.  Don't use this code!

$table = array('page' => 'mw_page',
	       'rc' => 'mw_recentchanges',
	       'pl' => 'mw_pagelinks',
	       'pt' => 'mw_protected_titles',
	       'qc' => 'mw_querycache',
	       'qcc' => 'mw_querycachetwo',
	       'rd' => 'mw_redirect',
	       'tl' => 'mw_templatelinks',
	       'wl' => 'mw_watchlist');

$namespaces = array(100 => 'ThisWiki',
		    /* 101 => 'ThisWiki talk', */
		    110 => 'RPM',
		    /* 111 => 'RPM talk' */);

$rename = array('Changes_to_Wiki_database' => 'ThisWiki:Changes_to_database',
		'LocalSettings.php' => 'ThisWiki:LocalSettings.php',
		'Changes_to_Monobook_skin'
		  => 'ThisWiki:Changes_to_Monobook_skin');

foreach ($table as $short => $t) {
  // Move:
  foreach ($rename as $orig => $new) {
    foreach (array(0 => '', 1 => 'Talk') as $old_ns => $old_name) {
      printf("UPDATE %s\n"
	     . "SET %s_title = '%s'\n"
	     . "WHERE %s_title = '%s' AND %s_namespace = %d;\n",
	     $t, $short, $new, $short, $orig, $short, $old_ns);
    }
  }
  // Fix namespace:
  foreach ($namespaces as $num => $name) {
    foreach (array(0 => '', 1 => 'Talk') as $old_ns => $old_name) {
      printf("UPDATE %s\n"
	     . "SET %s_title = REPLACE(%s_title, '%s:', ''), "
	     . "%s_namespace = %d\n"
	     . "WHERE %s_title LIKE '%s:%%' AND %s_namespace = %d;\n",
	     $t, $short, $short, $name, $short, $num + $old_ns,
	     $short, $name, $short, $old_ns);
    }
  }
}
?>

It's dangerous to show this code, because someone may come to this page after a Web search and erroneously think this is the way to introduce namespaces in MediaWiki or something. I'll say it again: Don't use this!

Unfortunately, generating SQL statements by hand is a dirty little secret of database maintenance. I've seen database maintenance done by evaluating SQL commands one at a time more often than I'd like. Instead of being programmatic and more efficient, "easter egging" methods descend into either "copy and paste" hell or "search and replace" hell, and so risk typos and who knows what else. Writing database maintenance scripts programmatically enables you to work as a single transaction and study the results in each iteration. This quality forces one to use a test version of the data and avoid another database faux pas -- working on live databases.
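To make the single-transaction point concrete, here's a rough sketch in shell rather than PHP. The same warning applies: don't use it. `gen_moves` is a made-up helper, it covers only three of the tables listed above, ignores namespaces, and does no quoting, so it assumes titles without apostrophes. The generated statements are wrapped in a transaction that ends in ROLLBACK, so a run against a test copy changes nothing until you decide the output looks right.

```shell
#!/bin/sh
# gen_moves: print one UPDATE per table to rename a page title.
# Hypothetical helper; the prefix:table pairs come from the PHP above.
gen_moves() {
    old=$1
    new=$2
    for pair in page:mw_page rc:mw_recentchanges pl:mw_pagelinks; do
        short=${pair%%:*}     # column prefix, e.g. "rc"
        table=${pair#*:}      # table name, e.g. "mw_recentchanges"
        printf "UPDATE %s SET %s_title = '%s' WHERE %s_title = '%s';\n" \
            "$table" "$short" "$new" "$short" "$old"
    done
}

# Wrap the whole batch in one transaction.
{
    echo 'START TRANSACTION;'
    gen_moves LocalSettings.php ThisWiki:LocalSettings.php
    echo 'ROLLBACK;  -- change to COMMIT once the statements look right'
}
```

The output is plain SQL on standard output, so it can be read, diffed between iterations, and only then piped to a client connected to the test database.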


Study Emacs with Lisp or natural language?

Apr. 18th, 2009 | 11:49 pm

There was a back and forth between Emacs bloggers Jared Dilettante and Ian Eure about whether Emacs users should write Emacs Lisp code for their own purposes or whether working in Emacs Lisp should primarily be in support of existing modes--and with the mode's documentation thoroughly read. At this point, the argument has fizzled out, but it's worth pointing out that a manual for SQL mode doesn't seem to exist. SQL mode, like most Emacs modes, is self-documented well. However, this argument is evidence that it could probably use a manual.

As someone who's neither a trained writer nor a good one, I know that writing in one's own spoken language is difficult. However, writing concise documentation about Emacs can be just as important as writing Emacs Lisp code, if not maybe more so. Writing documentation will give you a deeper understanding of how something works and help you learn things you didn't already know. It's also important because the documentation you write will help someone else to learn how to use Emacs, too.

Documenting Emacs also improves your Emacs Lisp skills because you'll likely be reading other hackers' Emacs Lisp code. Most important of all, you'll be working on existing code for Emacs rather than writing more when there is no real need to reinvent the wheel.

Don't get me wrong. It's fine to twiddle code. Reinventing the wheel is the basis for higher education and university study and critical to life-long learning. However, writing code and championing your work as a solution only acts to avoid studying and maintaining existing code and risks distracting the Emacs community from progress. In a way, it's almost anti-social.

I know some Emacs hackers think multiple implementations are important, and "I wouldn't have learned as much if I didn't manage my own Emacs package". These arguments are spurious, though. One should be able to earn these same benefits--and more--by joining in on an existing project, rather than forking a new one. It may take more effort, but it's the right thing to do and is under my column of "best practices".

The GNU General Public License gives the Emacs hacker a lot of freedom, but we all could still use the occasional self-discipline.


Cheap tricks in Emacs: Elisp manual

Apr. 13th, 2009 | 01:12 am

Another subtle feature of Emacs that I hope doesn't go away any time soon is an easy way to get to the Elisp manual. The key sequence is `C-h r TAB RET'.

Here's how it works. In Emacs 22, a new key binding appeared to access the Emacs manual, `C-h r'. Conveniently the first cross reference in the Emacs manual is "See Emacs Lisp(elisp)". Hitting `TAB' skips you to the first cross reference, then `RET' follows the cross reference.
