February 18, 2005

I'm Going Flickr

Biblioblogs, in a fantastic way, allow senior, junior, and wannabe scholars to engage others in current issues in the field. The newest-at-the-top format indigenous to the blog places a great emphasis on new material. The downside is that the delightful gems tucked away in the archives are sometimes "out of sight, out of mind". A real indicator of this is the plethora of broken image links, particularly on those blogs hosted at the blogging service web site.

Roy Brown emailed me a couple '04 photos this week, which prompted me to go looking for others. Jim Davila's SBL pics on his biblioblog had broken links--no unusual occurrence in web space. And this prompted me to take a look at what flickr.com has to offer. It is the most popular image hosting service, and I think it offers some splendid features. I have a few ideas for how biblioblog authors and readers might benefit:

  • For any images you add to your photostream that you deem of interest to bible scholars in general
    1. add them to the biblioblog group at Flickr, so that anyone interested can go and view the set of images made available to the group
    2. and even more exciting, give them descriptive "tags" so that they'll be found when searched for
  • Here's where the real benefit comes in, I think. If we agree on some standard tags (within this amoeba of ), you can subscribe to feeds to let you know when a new image has been posted with a specific tag. For instance, let's say we use the tag "Biblioblog" for images we think bible scholar/biblioblogging/e-list types might be interested in. Then, simply subscribe to the RSS 2.0 or Atom feed for "biblioblog" and you get an update whenever an image has been placed on Flickr with that tag. (Update: You can also subscribe to a feed for the Biblioblog Flickr group, and in that way receive notice when a photo has been added to the group itself.)
  • The benefits get even more absurd when we combine this with the brilliance of Technorati.com. Remember this all started with my search for SBL photos? Well, just have a look at the tag "SBL" at Technorati.com. It collects weblog articles, Flickr images, and del.icio.us/Furl bookmarks that possess the same tag, offering a compiled sampling on a given topic. (Have a look at their "What's a tag?" link for more info.)
  • But back to the more basic, Flickr gives me a static place to host images I use in my web pages
  • And on a personal note, I can create any number of custom sets of images offering, for instance, a static link to pics of my kids that my parents just love.
Those of you who read AKMA, probably most, are familiar with the whole saga of his love/hate relationship with . If you're not familiar with Flickr in particular, here is a recent post touting its benefits.
The restrictions for a free account seem reasonable.
  1. No limit on how much "space" you have
  2. You can upload a max of 10 megs per calendar month
  3. Don't go 90 days without any activity, or your free account will go defunct.

Well, all this was just to encourage, if you're not 100% set on how you host your web images, it seems to me you would do well to check into Flickr.

February 10, 2005

Unicode Characters of Interest to Bible Scholars

The major Unicode character blocks of interest to bible scholars include:

  • Combining Diacritical Marks
  • Greek and Coptic (Coptic has its own range in the next Unicode update)
  • Hebrew
  • Greek Extended

Scroll down for a test page for some of the more obscure Unicode characters from other character blocks of interest to bible scholars, including the newly approved GNT Unicode Sigla. The Cardo font and the New Athena font are probably the easiest to obtain that currently have the range of character forms.

Test  Hex      Unique Character Name
⸀     U+2E00   RIGHT ANGLE SUBSTITUTION MARKER
⸁     U+2E01   RIGHT ANGLE DOTTED SUBSTITUTION MARKER
⸂     U+2E02   LEFT SUBSTITUTION BRACKET
⸃     U+2E03   RIGHT SUBSTITUTION BRACKET
⸄     U+2E04   LEFT DOTTED SUBSTITUTION BRACKET
⸅     U+2E05   RIGHT DOTTED SUBSTITUTION BRACKET
⸆     U+2E06   RAISED INTERPOLATION MARKER
⸇     U+2E07   RAISED DOTTED INTERPOLATION MARKER
⸈     U+2E08   DOTTED TRANSPOSITION MARKER
⸉     U+2E09   LEFT TRANSPOSITION BRACKET
⸊     U+2E0A   RIGHT TRANSPOSITION BRACKET
⸋     U+2E0B   RAISED SQUARE
⸌     U+2E0C   LEFT RAISED OMISSION BRACKET
⸍     U+2E0D   RIGHT RAISED OMISSION BRACKET
⸎     U+2E0E   EDITORIAL CORONIS
⸏     U+2E0F   PARAGRAPHOS
⸐     U+2E10   FORKED PARAGRAPHOS
⸑     U+2E11   REVERSED FORKED PARAGRAPHOS
⸒     U+2E12   HYPODIASTOLE
⸓     U+2E13   DOTTED OBELOS
⸔     U+2E14   DOWNWARDS ANCORA
⸕     U+2E15   UPWARDS ANCORA
⸖     U+2E16   DOTTED RIGHT-POINTING ANGLE
⸗     U+2E17   DOUBLE OBLIQUE HYPHEN
ℵ     U+2135   ALEF SYMBOL for Sinaiticus

(These last few symbols are from the Plane 1 range. Cardo contains these, but for an ambitious font with more code points in this obscure plane than you can imagine, try the Code2001 font.)

𝑙     U+1D459  MATHEMATICAL ITALIC SMALL L for lectionary MS
𝔖     U+1D516  MATHEMATICAL FRAKTUR CAPITAL S for Septuagint
𝔐     U+1D510  MATHEMATICAL FRAKTUR CAPITAL M for majority text
𝔓     U+1D513  MATHEMATICAL FRAKTUR CAPITAL P for papyri
𝔭     U+1D52D  MATHEMATICAL FRAKTUR SMALL P for a small letter papyri
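
If the test characters do not show up in your font, the rows can be regenerated straight from the code points. What follows is only a minimal Python 3 sketch and not part of the original test page: the helper names "sigla_row" and "utf16_units" are made up for illustration, and it assumes your Python ships a Unicode database recent enough to know these blocks (any current version should).

```python
import unicodedata

def sigla_row(code_point):
    """Return (character, hex label, official Unicode name) for one code point."""
    ch = chr(code_point)
    return ch, f"U+{code_point:04X}", unicodedata.name(ch)

def utf16_units(ch):
    """Count the 16-bit units needed to store one character in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2

# The sigla block U+2E00..U+2E17, the alef symbol, and the Plane 1 letters.
code_points = list(range(0x2E00, 0x2E18)) + [
    0x2135, 0x1D459, 0x1D516, 0x1D510, 0x1D513, 0x1D52D,
]

for cp in code_points:
    ch, hex_label, name = sigla_row(cp)
    print(f"{ch}\t{hex_label}\t{name}")
```

The second helper hints at why the Plane 1 symbols are the fragile ones: each of them takes two 16-bit units in UTF-16, where the sigla in the U+2E00 range take one, so software that mishandles such pairs tends to show them as garbage.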

February 09, 2005

Unicode: A Bible Scholar's Introduction

An Introduction to Unicode

This article walks through the concept of Unicode as a means for handling Greek (and Hebrew, as well). I have in mind scholars who have had little success understanding or appreciating the movement to Unicode fonts and texts in their work. A more precise title of this document might be "An Introduction to Unicode Greek on Mac OS X". However, this conceptual introduction will be helpful to those using other OSes, including Windows and Mac OS 8 and 9. Also, while the examples I give center on Greek, they could easily use Hebrew or any other language as well. The level of discussion is designed for the bible scholar who considers themselves a non-techie--whose response to Unicode so far has mostly been, "Uni-what?" This first article centers only on beginning concepts, in contrast to the old system we are familiar with. I will later move towards "trying things out" on your computer. (If the Unicode characters on this web page are not displaying correctly, and you're using OS X, use Safari or Firefox, not Explorer. If you're running OS 8 or 9, have a look here to get you started.)

The Way We Were: Legacy Fonts

For years, we typed our Greek (and Hebrew) with the same ASCII characters we use for English (or other indigenous language). In order to achieve the correct appearance, we would switch the font of the Greek text to a font that used Greek character forms. Each different font (Helena, SuperGreek, Graeca, TekniaGreek, SBL Greek, etc.) had its own unique character associations, although the primary letters were mostly the same: an "a" became an Alpha; a "b" became a Beta. We typed in "abba" and it became "αββα". Once you typed in the letter "a", the computer encoded that key press and stored it in memory as ASCII character number 97 (hexadecimal 61), and then you could change fonts in your program to customize how that specific ASCII character was rendered, as an Alpha or an Aleph or a goofy glyph from Zapf Dingbats if you so desired. Hence, you had to change fonts for each change in language.

Accents, breathers, vowel points and other diacriticals were added by assigning otherwise unneeded letters and punctuation to non-spacing versions of those diacriticals that combined to create the character form desired. This is where the keyboard layouts associated with each font really began to differ. An Alpha with an Acute accent (ά) might be typed as "av" or "a/" depending on what font you were going to use. Hence, a text created in one font system needs that font on any computer that tries to display or print the text. Some technologies for embedding fonts into documents (PDF, Powerpoint, Word) are an attempt to help the situation. Another limitation is that the font you choose ends up dictating the keyboard layout you use to enter the text.

This western-centric system really begins to break down once we take a global perspective on characters. Ask yourself which Roman letter and its associated ASCII number we should assign to ཛྷ the Tibetan Dzha or す the Japanese Su. In languages with larger numbers of character forms, you quickly ran out of options, since the extended ASCII range is limited to 256 characters.

Another limitation is really shown in texts at the digital level. You can't create a web page that uses a specific Greek font, because a large percentage of visitors won't be using computers that have that font installed. And consider searching... it didn't take long for compiled documents and different documents that you maintained to use more than one different font system. So, you could try a search for τάς by searching for "tav"" or perhaps "ta/s" or maybe "tavV" depending on what font is being used... but good luck getting consistent and reliable results.
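
To make the search problem concrete, here is a small Python 3 sketch. The three font tables are hypothetical (I have not copied the real keyboard layouts of any named font); they only mimic the three spellings of τάς mentioned above. The function name "legacy_to_unicode" is likewise invented for the illustration.

```python
import unicodedata

# Hypothetical key-to-character tables for three imaginary legacy fonts.
# Each one spells the accent and the final sigma with different keys.
LEGACY_FONTS = {
    "FontA": {"t": "\u03c4", "a": "\u03b1", "v": "\u0301", '"': "\u03c2"},
    "FontB": {"t": "\u03c4", "a": "\u03b1", "/": "\u0301", "s": "\u03c2"},
    "FontC": {"t": "\u03c4", "a": "\u03b1", "v": "\u0301", "V": "\u03c2"},
}

def legacy_to_unicode(text, font):
    """Translate legacy keystrokes into Unicode using the given font table."""
    table = LEGACY_FONTS[font]
    decoded = "".join(table[ch] for ch in text)
    # Fold letter + combining accent into a single precomposed character.
    return unicodedata.normalize("NFC", decoded)

samples = [('tav"', "FontA"), ("ta/s", "FontB"), ("tavV", "FontC")]

# Three different stored strings in the legacy world...
legacy_spellings = {text for text, _ in samples}
# ...but only one string once everything is expressed in Unicode.
unicode_spellings = {legacy_to_unicode(text, font) for text, font in samples}

print(len(legacy_spellings), "legacy spellings")
print(len(unicode_spellings), "Unicode spelling:", unicode_spellings)
```

A search for the word in the legacy documents has to guess which of the three keystroke strings was used; a search over the converted text has only one string to look for.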

The "old" font system proves insufficient.

The Unicode Standard: Expanding the range

Unicode was invented by computer geeks, linguists and scholars as an evolving solution to creating/encoding text in a multilingual world. Unicode is based on a system similar to the original ASCII system, where each character is encoded as a unique number called a character code point, except that the range of numbers is greatly expanded. Instead of just 128 characters (or 256 in extended ASCII ranges), Unicode currently has room for around a million different characters. The intent is to have space to assign a unique number to every known character in human history, in addition to supplemental drawing lines ┎, mini pictures ✈, icons ⌥, symbols ϗ, dingbats ☚, etc.

(including a small collection of crosses and religious symbols: ✙ ✚ ✛ ✜ ✝ ✞ ✟ ✠ ☓ ☦ ☧ ☨ ☩ ♰ ♱ † ‡ ⁜ ☥ ✡ ☪ ☫ ☬ ☰)

The Unicode ranges are divided into Character Blocks, each associated with different languages or a type of punctuation or symbol.

Code points in the range of decimal numbers 0 through 127 are English letters (Basic Latin, technically) and punctuation, identical to the old ASCII number assignments, which preserves backwards compatibility. So, "a" is 97, "b" is 98, and so on. The other blocks follow in their own ranges:

  • Code points 768-879 contain combining diacriticals.
  • Code points 880-1023 are the basic Greek letters and symbols. By "basic", I mean modern Greek letters with no diacriticals other than a simple tonos accent or diaeresis. The "α" is decimal number 945, "β" is 946, etc.
  • Code points 1424-1535 are the Hebrew character block.
  • Code points 7936-8190 are the Extended Greek range. These are characters with the full set of accents and breathing marks added on.
  • The range 8592-8703 presents various types of arrows, for instance.
  • Code points 11392-11519 are Coptic characters (which used to be conflated into the Basic Greek range).
  • The range 65536-65786 is Linear B.

Well, you get the idea. In addition to the unique number given to each language or alphabet's character, the published Unicode Standard provides a unique name. The letter "a" is "Latin Small Letter A"; the letter "α" is "Greek Small Letter Alpha".

These ranges are the decimal numbers for the character blocks, but Unicode code points are most often given in their hexadecimal equivalents (for computer reasons you really don't need to care about). So, a lower case Alpha is Unicode decimal 945, which is the same as hexadecimal 03B1 (normally given as U+03B1).

All that is to illustrate the only thing you need to know, which is that each character is encoded in your document with its own unique code point. From now on, an "α" will never be mistaken for an "a", and an "a" will never be mistaken for an "α", and the "a" and the "α" can be side by side in a document, even in the same font!
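
For the curious, the decimal numbers, hexadecimal labels, and official names can be checked with a few lines of Python 3. This is just an illustrative sketch; the helper name "describe" is made up, and nothing here is needed to follow the rest of the article.

```python
import sys
import unicodedata

def describe(ch):
    """Return (decimal code point, U+ hex label, official Unicode name)."""
    cp = ord(ch)
    return cp, f"U+{cp:04X}", unicodedata.name(ch)

for ch in ("a", "b", "\u03b1", "\u03b2"):
    decimal, label, name = describe(ch)
    print(f"{ch}  decimal {decimal:>4}  {label}  {name}")

# The "a" and the "α" are distinct code points, whatever font displays them.
print(ord("a") == ord("\u03b1"))

# Total number of code points the standard has room for (17 planes of 65,536).
print(sys.maxunicode + 1)
```

The last line is where the "around a million" figure comes from: the code space runs from U+0000 to U+10FFFF, which is 1,114,112 positions.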

The New Way of doing things: Unicode Code Points versus Font Glyphs

Unicode is actually much more than just an expanded range of characters. It represents a new way of approaching the relationship between the abstract concept of a character represented by a code point and the glyphs (character forms) that fonts use to represent the characters.

A code point is not precisely synonymous with a character. The code points from the combining or modifying diacritical ranges combine with the previous code point to create one text element, one character. So, U+03B1 α plus U+0301 ´ becomes ά, which your Unicode-savvy application turns into one character. This is much unlike the old legacy font system, where you could move your cursor through the text using an arrow key and see the insertion point pause between the letter and the accent, letting you know that there were really still two characters behind the one character being displayed.

As I've indicated, the most powerful and beneficial aspect of the new approach represented by the Unicode Standard is that a lower case Alpha is always encoded with the same code point, so that no matter what font you are using or what font someone else uses to view your document or web page, a lower case Alpha will always be a lower case Alpha. A ἇ will always be a ἇ, and a ᾬ will always be a ᾬ. Now, just how the letters look will change with each font, of course; you'll most certainly prefer the typography of the Greek character forms of some fonts over others, but you'll never encounter the problem of needing a specific font to correctly render an Alpha loaded with diacriticals. You'll never run your Keynote or PowerPoint presentation on a classroom computer and encounter the problems of missing fonts. Truth is, often when you're reading a web page or other document that has Unicode Greek in it, you won't even know what font you are using to view it.

As I said, the Unicode character set is divided into Character Blocks associated with different languages (or some kind of widget or punctuation). But no Unicode font contains a character form for every code point in every Character Block. Some large Unicode fonts that come installed on the latest Mac OS X contain most Character Blocks, but still not all. Some specialized fonts, such as those made for Greek scholars, may only contain character forms for a handful of blocks, such as some diacriticals, Greek Basic and Extended, Coptic, and a few more character forms from within various Character Blocks that are typically used in Greek scholarship.

So, what happens when you are reading a document whose text is set to a font which does not contain a character form for the character you are encountering in the text? Your smart Mac by default goes searching in its font mapping database and displays that character in a font that contains a character form for that code point. For instance, let's say you find the word τᾷς. The ᾷ is the code point U+1FB7, which is from the Extended Greek Character Block. Your text is set to the Arial font, which does not contain that Unicode range. So, that character is displayed by another font that does contain the range. If you have OS X 10.2 or 10.3, and you have not installed any other Unicode Greek fonts, I can assure you that the ᾷ is being displayed by the Lucida Grande font, because that is the only font that comes on a standard install that contains all the characters from the Extended Greek Character Block.

Some specialized applications (such as Mellel) can override this default substitution behavior, so that you'll see a garbled character until you manually change the font to one that contains a character form for that code point. Firefox also can stymie font substitution on the fly.

This highlights another powerful aspect of the Unicode approach to characters... each code point for every character under the sun within the Unicode Standard is an abstract concept, and it leaves the issue of what glyphs are used to display that character up to the font, application, and OS that are using the encoded text.
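
The point that one displayed character may sit on top of more than one code point can be seen directly in Python 3. This sketch leans on the standard normalization forms (NFC composes, NFD decomposes), which this introduction otherwise glosses over; the helper name "code_points" is invented for the example.

```python
import unicodedata

def code_points(text):
    """List the code points in a string as U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]

alpha = "\u03b1"           # GREEK SMALL LETTER ALPHA
acute = "\u0301"           # COMBINING ACUTE ACCENT
combined = alpha + acute   # two code points, shown as one accented letter

# NFC folds the pair into a single precomposed code point.
composed = unicodedata.normalize("NFC", combined)

# NFD goes the other way: the alpha of U+1FB7 unpacks into a base letter
# plus a combining perispomeni and a combining ypogegrammeni (iota subscript).
decomposed = unicodedata.normalize("NFD", "\u1fb7")

print(code_points(combined))    # ['U+03B1', 'U+0301']
print(code_points(composed))    # ['U+03AC']
print(code_points(decomposed))  # ['U+03B1', 'U+0342', 'U+0345']
```

Whichever form a document happens to store, the font is only asked to draw the result; none of these lines mentions a font at all, which is exactly the separation of code points from glyphs described above.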

Well, I think I'll end the abstract introduction here. I hope to offer concise summaries of further issues regarding Unicode Greek on your Mac. If you have requests for topics to cover, be sure and send them along.

Future ideas for articles on Unicode

  • Unicode Tutorial -- a brief glance on your Mac OS X
  • Embedding Unicode into your HTML documents and blog entries
  • Unicode Keyboards -- typing in your language
  • Unicode Fonts -- displaying the character the way you want
  • Unicode tools, helps, and other links
  • Understanding Extended Greek letters and combining diacriticals
  • Converting your "old" style, legacy font Greek documents into Unicode
  • Unicode Tables of interest to bible scholars
Note: If you are uncomfortable with the way I have glossed over some issue or made a statement that is "not quite right actually," then this document is probably not for you. For instance, I know the difference between a code point and which encoding a document uses to encode that code point. I've ignored the issues regarding combining diacriticals and precomposed forms and Normalization Forms C & D. I may address some of the issues in a followup discussion. It's just beyond the scope of this basic introduction. If some part of this document is confusing, or you have a suggestion for how to explain a concept, please pass it along that we might improve the explanations given. Last updated: 2/11/05

February 08, 2005

Mellel and Bookends Comments

Danny Zacharias at deinde.org has drawn my attention to his thoughts about Mellel, the very promising word processor that scholars would do well to know about, and the bibliographic software Bookends. As I have said to the many of you asking my opinion and review of word processor options, I'm withholding significant comments on word processor comparisons, because I see us in a state of transition right now. You can read the full post here at deinde, but I'll highlight a few comments. The post is not so much a full review, as it is a testimonial for choosing the applications, highlighting a few key advantages for Mellel and Bookends. One is the price tag. Another is the tremendous responsive support I, too, have personally witnessed from the folks at Redlers. They are very active in support forums and email responses. Danny also says,

Mellel is a word processor for writers and scholars. It does not have frills like inserting clip-art and inserting drawings, etc (although these things can be done). Mellel is designed for paper writing, and specially equipped to handle large documents, especially book/dissertation writing, etc.
This really goes along with the smaller price tag. It is a smaller program. It feels smaller. The only snag here is that you never know what "frills" any given user is going to miss. Danny mentions that he misses the voice recording feature found in Word. I've never used the voice recording feature in Word since the day I recorded my dog barking on Word 5.1a in order to scare my cat at random intervals. Danny also mentions the customizability of Mellel, but this is more a feature that is up to par with other standards, not something that puts it ahead of the game. In Word, for instance, you can do anything you want, and I mean anything you want, with a simple keystroke if you're willing to record the VBA script and assign a keystroke.

This leads me to a MAJOR problem with Mellel. Mellel isn't scriptable! It has no AppleScript functionality, or other macro solution (at least a search through the help files found none). All I can say is "What a shame." There's no chance I'm going to use a word processor that doesn't allow me to automate repetitive tasks. The only reference to AppleScript in the current (1.8.2) PDF manual is a negative remark about scripting. Well, one can of course devise some work-arounds using 3rd party solutions or GUI scripting to accomplish repetitive tasks, but this is no solution for me. The discussions on the Mellel forum seem to indicate AppleScript is a priority for the developer. That is promising to hear, and I await it.

Also, I have a great appreciation for Mellel's approach to Unicode fonts... it overrides the default behavior of changing display fonts if the current font does not contain that character. But, as the developer shared with me in an email, there's no reason to deny the user this smart-technology. It doesn't prevent the user from then still choosing the best font for displaying their Hebrew or Greek. Despite those caveats, don't forget, Word can't do Unicode Hebrew, while Mellel on the other hand is indigenous to the language.
Hopefully someday I'll do a review of the app myself, where I'll say things like, "Thank you for adding some side border to the window pane." I'd like to spend more time with Mellel, but the demo for the beta version I was sent ran out on its time limit before I got around to doing much with it.

Update: I just received the newest beta for Mellel. It seems to include many added features, very promising. It includes a brilliant, best-of-both-worlds solution to the Unicode font substitution I've mentioned earlier, and discussed with a Redler developer months ago. I'll not discuss them here because I don't disclose features of software I have a beta copy of. I find it interesting that I received the beta version not 30 minutes after posting this blog entry about Mellel. The trial period is still expired though... going to be a bit annoying to try out this way.

As for Bookends, which I've not yet looked at, the main thing I have to say is if you haven't begun exploring the use of a bibliographic manager for your work, "shame on you". I'll say more about that on another day. The second thing is that if you are using Mellel, something that may very well be a great idea, you should also choose Bookends. Intentional integration at the programming level cannot be surpassed.

February 05, 2005

Book Meme

Oh, why not.

  1. Grab the nearest book.
  2. Open the book to page 123.
  3. Find the fifth sentence.
  4. Post the text of the sentence in your journal along with these instructions.
  5. Don’t search around and look for the “coolest” book you can find. Do what’s actually next to you.
Here's what I got, from Questioning Q: A Multidimensional Critique edited by Mark Goodacre and Nicholas Perrin:
A decent case could be made for the Matthaen origin of this spelling, but it is a case that cannot be made on the basis of the data considered by the IQP.
Technically, my GNT anniversary edition was closer to hand, but I didn't want to come across as a zealot by giving:
οἱ δὲ μαθηταὶ ἐθαμβοῦντο ἐπὶ τοῖς λόγοις αὐτοῦ.
from Mark 10:24. Via The Coding Humanist

February 01, 2005

Applications to Know About

Many lists of applications and utilities, from the intentionally selective to the exhaustively exhaustive, are available out there in the webworld and blogdom alike. I thought it would be good to present here a page of applications and utilities that are potentially of specific interest to bible scholars. Instead of beginning with a long list, I'll highlight a single app from time to time, and add it on. I intend to omit the obvious, and focus on rare/niche utilities that have specific utility for those in our field, but applications that are just sooo cool and breakthrough-ish regarding how you work are fair game as well. Please make your own suggestions for Mac applications that fit these categories in the comments. The more input we receive, the more we all benefit.

1. First on the list is a new innovation I learned about this week--a library database manager that allows you to enter your books by scanning the barcode with your Quicktime compatible webcam/digital camera/camcorder. One option is Delicious Library. Delicious Library specializes in simplifying the input of your books (and CD's, etc.) by scanning the barcodes with your iSight or any other Quicktime-supported web cam or camcorder. If you have a camera you can video chat on your computer with, it will most likely work. It scans the barcode, then retrieves all available data from the Amazon.com database. The library file is stored in an XML format, so export possibilities are rather unlimited. So, if you've decided to update your library file, while this won't help with the oldest members of your collection, what a time saver otherwise. But, at $40, I can't say I'm recommending it without reservations. It does have a downloadable demo, so you might find it's just up your alley. Thanks to Merlin Mann for his review of Delicious Library as well.

Update: Another option, recommended by Chris, is Booxter. Booxter seems to be as fully featured as the previously mentioned alternative, with a cost of $15.
The extra features on Delicious Library (such as the "To Whom Have I Loaned this Book" feature) are rather superfluous, IMHO. I'd like to learn more about the export format possibilities for Booxter. The goal for me is to have a means of entry, and then export a data file that I can then use elsewhere, such as Endnote or in my own custom database. My apologies if all these apps are , but I so rarely use Classic these days.