[ Wren's log of random thoughts on his f/oss projects ]

Sunday, 8 June 2008 : The PLS group are made of awesome.View as RSS

Now that the summer's here I've been hoping to get back into posting more regularly. Over the last year I've accumulated a large number of "to read" papers on various topics having to do with functional and logic programming. And now that the summer's here I'm finally getting a chance to go back and read them. That sounds like two great tastes that taste great together, so let the linkblogging begin.

Functional programming is historically maligned with having less efficient implementations than imperative languages do, mostly because historically that's been the case. Note the emphasis on "historically". However, if you're not careful, you may be perpetuating that sad state of history. If you think —being a non-academic programmer— that academic papers don't affect you, quoth dons on #haskell: programming is not for the naïve.

In functional languages the list data structure is foundationally ubiquitous, and in lazy languages like Haskell they also abstract over the recursion pattern of loops in imperative languages. Unfortunately this excessive use of lists places an enormous burden on allocating and reclaiming memory, which is a major source of historical inefficiencies in functional language implementations. Fortunately, much research has been done to overcome these issues... if you take advantage of it.

One of the great claims of high-level (mathematically-rigorous) languages is that the altitude gives the programmer the ability to program prettily and idiomatically, while making the true invariants more visible to the compiler thus allowing it the freedom to optimize as far as (or further than) illegible programming would've let you. Today's papers are about approaching that goal through "deforestation" techniques which are a subset of "fusion" optimizations. All of today's papers are about automating these optimizations, but compilers are complex and it takes time to work current research into them. So today's post is also a PSA: only you can start forest fires, use libraries which optimize your lists away.

@InProceedings{ CSL2006:rewriting,
    author    = "Duncan Coutts and Don Stewart and Roman Leshchinskiy",
    title     = "Rewriting {H}askell {S}trings",
    booktitle = "PADL 2007: Practical Aspects of Declarative Languages
                 8th International Symposium",
    year      = 2007,
    month     = jan,
    pages     = "50--64",
    publisher = "Springer-Verlag",
    url = {http://www.cse.unsw.edu.au/~dons/papers/CSL06.html},
    url = {http://www.cse.unsw.edu.au/~dons/fps.html},

    Abstract = {
    The Haskell \textit{String} type is notoriously inefficient.
    We introduce a new data type, \textit{ByteString}, based on
    lazy lists of byte arrays, combining the speed benefits of
    strict arrays with lazy evaluation. Equational transformations
    based on term rewriting are used to deforest intermediate
    ByteStrings automatically. We describe novel fusion combinators
    with improved expressivity and performance over previous
    functional array fusion strategies. A library for ByteStrings
    is implemented, providing a purely functional interface, and
    approaches the speed of low-level mutable arrays in C.}
}

If your code does any string processing of note, you'll want to use Data.ByteString available on Hackage. This library provides the standard C implementation of unboxed arrays of characters, but with the standard Haskell interface of high-level functions for treating them like lists. This paper is also one of the best introductions to the field of deforestation I've seen so it's a good read to get a handle on the issues and the solutions out there.

@InProceedings{ Gill93:shortcut,
    author    = "Andrew Gill and John Launchbury and Simon~L. Peyton Jones",
    title     = "A Short Cut to Deforestation",
    booktitle = "FPCA '93: Conference on Functional Programming
                 Languages and Computer Architecture",
    location  = "Copenhagen, Denmark",
    pages     = "223--232",
    month     = jun,
    year      = 1993,
    publisher = "ACM Press",
    address   = "New York, NY, USA",
    
    isbn = "0-89791-595-X",
    url  = {http://citeseer.ist.psu.edu/gill93short.html},
    url  = {http://portal.acm.org/citation.cfm?id=165214},
    doi  = {http://doi.acm.org/10.1145/165180.165214},
    
    Abstract = {
    Lists are often used as ``glue'' to connect separate parts of
    a program together. We propose an automatic technique for
    improving the efficiency of such programs, by removing many of
    these intermediate lists, based on a single, simple, local
    transformation. We have imoplemented the method in the Glasgow
    Haskell compiler.}
}

This is an older paper and one of the first to present a general method for all permissible programs rather than a more restricted subset. The optimizations mentioned here are already implemented for stock GHC lists so you don't even need a library to take advantage of them. This is more of a background paper for contextualizing the PLS papers.

@InProceedings{ CLS2007:stream,
    author    = "Duncan Coutts and Roman Leshchinskiy and Don Stewart",
    title     = "Stream {F}usion: From Lists to Streams to Nothing at All",
    booktitle = "ICFP 2007: Proceedings of the ACM SIGPLAN International
                 Conference on Functional Programming",
    year      = 2007,
    month     = apr,
    note      = "To appear",
    url = {http://www.cse.unsw.edu.au/~dons/papers/CLS07.html},
    url = {http://www.cse.unsw.edu.au/~dons/streams.html},
    
    Abstract = {
    This paper presents an automatic fusion system, \textit{stream
    fusion}, based on equational transformations, that fuses a wider
    range of functions than existing short-cut fusion systems. In
    particular, stream fusion is able to fuse zips, left folds and
    functions over nested lists, including list comprehensions. A
    distinguishing feature of the stream fusion framework is its
    simplicity: by transforming list functions to expose their
    structure, intermediate values are eliminated by general purpose
    compiler optimisations.
    
    We have reimplemented the entire Haskell standard list library
    on top of our framework, providing stream fusion for Haskell
    lists. By allowing a wider range of functions to fuse, we see
    an increase in the number of occurrences of fusion in typical
    Haskell programs. We present benchmarks documenting time and
    space improvements.}
}

This paper is a followup on Rewriting which seeks to apply the fusion techniques to all lists (though not the unboxed arrays part of Data.ByteString). This too is provided as a library Data.List.Stream also on hackage. The plan is for this library to eventually replace the whole Data.List module and to ship with GHC itself.

One of the reasons the PLS folks are made of awesome is the empirical benchmarking they give along with the theory. Sadly, due to some limitations in GHC's general optimization techniques, this library is still experimental and still somewhat hit or miss (as of 2007 at least). For the majority of programs there's marginal gain over the foldr/build approach, for some there's amazing gains (because foldr/build can't apply), but for some deeply nested loops those limitations in GHC means we can't completely clean up after ourselves and so there's actually performance degradation. The moral of the story is to profile your code and use this library whenever possible (and help the maintainers when it breaks). As for the profiling, stay tuned...

music of the moment: Di Me ~ Gipsy Kings (Compas)

categories: I for one welcome our functional overlords, links, haskell

Sunday, 25 April 2008 : .agenda

Another stage in The Plan has been revealed to me:

Previous stages I'm permitted to share with the uninitiated:

music of the moment: Happiness Is A Warm Gun ~ Tori Amos (Strange Little Girls)

categories: I for one welcome our functional overlords, links

Sunday, 16 April 2008 : Escaping monads and topological spaces of computation

So I got to thinking about monads for computation, in particular about the "no escape" clause. In practice you actually can escape from most monads (ST, State, List, Logic (for backtracking search), Maybe, Cont, even IO) and these monads all have some "run" function for doing so (though it may be "unsafe"). These functions all have different signatures because there are different ways of catamorphing the computation away in arriving at your result, but in principle it's possible to subtype these computations into having appropriate generic execution functions.

The simplest computation is a constant computation which successfully returns a value, ST and Identity are just such computations. One way of extending this is to say that a computation is nondeterministic and can return many answers[1]. A different way of extending it is to say that the computation might fail to return any value, Maybe is just such a monad. And of course we could take both those extensions and either fail to return any answers or return nondeterministically many answers, List and Logic are our examples here. Another extension is that a computation may have different kinds of failure, like (Either e), and this is where things start to get interesting.

To generalize what I've said so far we can talk about the topological space of a computation's result. For a computation m a a constant computation has a result in a (we can think of the type a as the alphabet of possible kinds of success). An infallible nondeterministic computation has a result in ak where k ranges over the positive natural numbers and indicates all the different ways we can succeed.

A fallible computation could be thought of as ak where k is either 0 or 1, and where the empty result signifies failure. A different way of thinking fallibility which doesn't rely on empty results is as e | a where e is our error type. From this perspective the difference between Maybe and (Either e) is that Maybe has a unit type for failures, whereas (Either e) has a vocabulary/alphabet of e kinds of failure. Simply-fallible nondeterministic computations like List have results in the space a(0|k) == () | ak. For nondeterministic computations which may have some variety of possible failures, their result is in the space e | ak for an e with cardinality greater than 1. (Infallible computations also have this form, but for an e of cardinality 0.)

To run a constant computation your function is of the type m a → a, easy as pie. For simply-fallible computations you need to have some default value to return on failure in order to make it safe, hence m a → a → a. For nondeterministic computations you need a way of combining successes, hence m a → (a → a → a) → a. And for simply-fallible nondeterministic computations you need both: m a → (a → a → a) → a → a (which should look suspiciously similar to fold, though we've abstracted away the mapping aspect of fold which allows us to inject the values of a into values of a space b). For computations which have a variety of failure modes, instead of a default value, we need a function from errors to default values.

But let's put aside the idea of escaping monads and go back to the more interesting track of the typology of result spaces for computations. From symmetry, one may wonder whether it's possible to have a computation which may fail nondeterministically many times, i.e. has results in the space ej | ak for some j larger than one. Certainly it's conceivable. In fact, one reasonable example of this shows up in things like compilers. A good compiler does not simply exit after finding one error but rather tries to continue as long as possible in order to return as many errors as it can. Since there are different varieties of errors, compilers have results in ej | a for some non-unit vocabulary of errors e and some vocabulary of compiled programs a (for compilers that batch separate compilation we may rather think of it as in the full ej | ak where each object file is a separate value in a).

With simply-fallible nondeterministic computations we noticed that a0 == ()1, one may wonder then how to think of many errors in a simply-fallible computation. I posit that one way of thinking of multiple unit failures is as aj for some negative natural number j. In some interesting areas of formal language theory and typology theory the idea of such negative length strings is explored in some depth, though unfortunately I'm not familiar enough with these extensions, but I'm intrigued by the connection if it does in fact hold. This whole thing is also somewhat evocative of imaginary and complex numbers, another area I'm insufficiently schooled in to make any astute observations.

One thing such theories allow is the intermingling of positive and negative letters (or real and imaginary parts). So far I've presented the space of computations as ej | ak but why do we need to have the disjunctive union there? Could a nondeterministic computation return both success and also failure? Usually when we think of nondeterminism we implicitly assume we're going to throw away all the errors, leaving zero successes as our only error result. There's no reason why we couldn't keep the errors as well as the successes, though there's some question about whether such a thing would be helpful as a particular model of computation.

Again I'd argue that such a thing may indeed be helpful. One of the core ideas in quantum mechanics is the notion that things can exist in superposition. Nondeterminism captures the idea of a superposition among many choices. The notorious example from quantum is that something may be in a superposition not only between different locations but also between existing and not existing (or living and not living for the sake of some poor felines). I'd posit that a computation which nondeterministically returns both success and also failure captures just such a situation. Again, I know too little about quantum to say much of import here, but I think the connection is real. With many kinds of failure (non-unit error type) and the ability to fail in many ways (non-unit exponent) we can capture an entire multidimensional space of nonexistence/failure which could go in superposition with all the kinds and ways of existence/success.

Fuzzy set theory and fractal decision boundaries (and hence all of chaos theory) are other areas which may benefit from the ability to have computations that succeed while failing. This in turn brings us back to the theory of computation. ej | ak obviously models the ability of a turing machine, but does ej ak have any extra formal power? That the latter model allows for contradictions (both succeeding and also failing at the same time) may seem to imply that it does, but of course our first-blush intuitions about formal power don't always match reality (e.g. nondeterminism only gives formal power at certain levels of the language/automata hierarchy, but not at other higher levels).

In all, an interesting thought experiment. It also makes clear to me that I need to learn more higher math in order to flesh out the connections to complex numbers, negative length strings, and quantum mechanics.

[1] Of which I can conspicuously think of no examples offhand which aren't also simply-fallible. I'm sure there're some out there. We could construct one by having a new kind of List which has a singleton constructor instead of nil, and some tree ADTs are similar, though I'm not sure how "natural" these are for expressing any particular variety of computation in practice.

categories: monads, theory of computation, computer science

Sunday, 25 April 2007 : On Composition and Syntax: Thoughts

One of the classes I'm taking this term alternates between Haskell and Smalltalk in trying to teach a bunch of seniors and graduate students "extreme programming" and how coding in the real world is different from in school. In one of the exercises we were working with an attempt to formulate Haskell-like tuples and lists in Smalltalk, in particular trying to debug the implementation we were given. We found numerous issues with the implementation, but one in particular has been nagging me. It indicates inherent limitations in the syntax of Smalltalk, but mulling it over, it seems to be an even deeper issue than that.

Part of the implementation for tuples[1] was overloading the comma operator (normally string concatenation) to create pairs like (a, b) or triples like (a, b, c) etc. The problem was this, how do we do tuples of tuples? Using code like ((a, b), (c, d)) does not give us a pair of pairs, but rather is equivalent to (a, b, (c, d)). I thought, at first, it was a problem of associativity; when the parser sees the second comma, the one after the b, it takes the preceding object and combines it with what follows, in effect it's acting like an operator for constructing lists. Reversing the associativity of the operator just gives us the same problem in the other direction yielding ((a, b), c, d). This is not an issue for Haskell because the parentheses are required and so they let us know for certain when we're done making a tuple. But in Smalltalk as with most languages, the parentheses are only there as suggestions on how to create a parse tree.

All this diagnosis I did for the exercise, but I've just struck something. There is a deep seated difference between "destructive" and "constructive" operations in any language. Operators like addition, subtraction, and so on are destructive operations. The results they come up with are different from the arguments they start with. When we say 3 + 5 there is nothing of the three or the five in our answer of eight; certainly the answer is derived from the inputs, but the 8 is a unique entity, we could have made it by adding two and six, or subtracting four from twelve, or any number of other operations. Destructive operations are a form of compression, we had two objects, now we have one, something was lost.

But there are other operations too, such as the comma we were trying to make and such as the colon, period, and dollar in Haskell. These constructive operators create a new value which is a composition of it's arguments or a container for them. The list constructed by 1:2:3:[] is not merely derived from the arguments given to the operators, those arguments themselves are a part of the list they are in, they are pieces of the composite whole, we can't make that list without those values (even if the exact sequence of operations to arrive at those values can vary).

This idea of constructive operations should not be new to any computer scientist. It is merely a rephrasing of the idea of abstract algebras. But the distinction between constructive and destructive operations should give light to the very real differences between abstract algebras and real ones. Generally, real and abstract algebras do not interact in the rarified world of mathematics, or to the extent they do they're considered isomorphic.[2] But in the real world, in computers, the two do coexist and they behave very differently.

The problem then is this: how to we define the boundary of a composition? For the case of tuples in Haskell the answer is to make the parentheses required on the periphery and forbidden within, thus in effect making a series of unique operators: the binary (,), the ternary (,,), the 4-ary (,,,), etc. But the problem is a systemic one, and it can be rephrased as this: when do we treat a composition as a single whole vs when do we treat it as a collection of parts? Is a list within a list just a longer list, or is it a list with a composite value? The answers, of course, depend on the language[3] but it's something which always needs specifying and which is always fraught with tradeoffs. In imperative and object-oriented languages this is also known as the problem of when should we use a pointer or reference to a composition vs when should we use the composition itself? In languages with pass-by-reference as well as pass-by-value the whole issue becomes even more hairy because we can offer the illusion of treating compositions as wholes while retaining the syntax of treating them by parts, forcing one to "just know" which functions have which behaviors.[4]

Aside from the question of when to treat a composition as a whole vs a collection of parts, there's another question which is striking. In dealing with the real algebras of destructive operations, parentheses have the meaning of affecting the order of operations, or being a serialized representation of a tree structure. Which means parentheses that don't change the order from it's parentheses-less default can be discarded, and since intermediate values are destroyed, once a parse tree is constructed all of them can be discarded (just as they're unneeded in prefix-only or postfix-only languages). With the abstract algebras of constructive operations, however, parentheses mean something entirely different: they represent the boundaries of the composition, they declare the semantic shift from where we should treat a composition as a collection of parts (that is, as being in the process of being constructed) to treating it as a unified whole (that is, as a single value to be used in other operations). Because they indicate a shift in meaning they can (almost) never be discarded.

And yet, no language I'm familiar with actually makes a distinction between constructive and destructive operations nor makes a distinction between the very different kinds of parentheses.

[1] For those unfamiliar with the term/datastructure, tuples are objects that contain a fixed number (and order) of values which may be of different types; somewhat like a vector. The fixed size is important and makes it different from lists which have variable and mutable size or arrays which have variable (though generally immutable) size (and both of which are typically limited to being collections of a single type of variable). That is, a 2-tuple or pair is an entirely different type than a 3-tuple or triple, unlike different length lists or arrays which can generally be treated the same. For those familiar with C-like languages, tuples are a lot like records or structs though their components don't have names, just positions.

[2] So for example, the natural numbers can be thought of as the value 0 and the value received from applying a successor function to other natural numbers. Hence succ(0) is analogous to "1", succ(succ(0)) is analogous to "2", etc. In this way they're considered isomorphic by mathematicians since we can find a mapping between the inductive/constructive/abstract representation and the normal/destructive representation. But you can start to see the difference if you think about how we would go about writing that successor function. For the abstract structure to hold, there don't exist bit patterns to represent the natural numbers as we're used to seeing them which we could return, instead the successor function (like the zero function) are just data constructors like when we call new to create an instance of an object.

[3] In Perl, it's just a longer list. (Yes, I mean "list" (or perhaps "sequence") and not "array". Perl does have both, though only the latter can be stored as a datatype. But if we're talking about Perl's list-like "arrays", then the answer becomes an array with composite values (stored as references).) In Haskell it's always a list of compositions, and all the scalar values the thing ultimately contains must be of the same monomorphic type. In our Smalltalk assignment above, it's both, depending on where it's at in the list.

[4] Which is not just an idle question in, for example again, Perl. Normally arrays are expanded into lists of their components for passing to functions, hence given a call like foo(@a, @b) the function cannot know where the first array ends and the second begins, it just gets a list of all the values together. But by using prototypes you can get pass-by-reference behavior and so suddenly you can do things like that, but they must be real arrays or hashes not references or (such as they exist) literals. There isn't the same problem in C/C++/Java because in those languages the name of an array is actually a pointer/reference and you never really have the array as a collection of individual components.

music of the moment: Narrow Your Eyes ~ They Might Be Giants (Apollo 18), She's Dead ~ Jim's Big Ego

categories: design, computer science

Sunday, 1 August 2006 : LDAP: authentication on linux

Today's victory: getting a linux client to authenticate users (i.e. let them log in)

This tutorial covers most of the nasty bits. It turns out that I was nearly there with my earlier attempts at setting up the client, I just needed to mess with /etc/pam.d/* stuff. That tutorial will only enable logging in via normal means such as using su. To use ssh you also need to add the following to /etc/pam.d/ssh:

auth       sufficient   /lib/security/pam_ldap.so
auth       required     /lib/security/pam_env.so
auth       required     /lib/security/pam_unix.so shadow nodelay
account    sufficient   /lib/security/pam_ldap.so
account    sufficient   /lib/security/pam_ldap.so
account    required     /lib/security/pam_unix.so
password   required     /lib/security/pam_cracklib.so
password   required     /lib/security/pam_unix.so shadow nullok use_authtok
session    required     /lib/security/pam_unix.so
		

Note that some tutorials say to use pam_pwdb.so instead of pam_unix.so, but pam_unix.so seems to obviate the other so far as I can tell. On Solaris it looks like that in turn has been obviated by other modules. If you experience problems with PAM, yet another log to take a look into is tail /var/log/auth.log .

categories: ldap

Sunday, 18 July 2006 : LDAP: linux clients

Today's task: screw Solaris, let's try to get a linux client running.

So far I'm not sure if linux boxen can use profiles, though surely there must be a way. Instead they use /etc/ldap/ldap.conf. If all else fails we can make a script to update that file from the profile. All you have to do is set the BASE and URI appropriately and ldapsearch works perfectly.

Second task is to set it up to use ldap instead of nis for automounting. Since my old ldifautomnt.pl script was using Solaris' nisMap and nisObject objectclasses I needed to rewrite the script to use the automountMap and automount objects linux expects. Alas, the OID for automount given in the schema file I mentioned earlier conflicts with that of memberNisNetgroup. Jury's still out to lunch on how to properly proceed, but for now I just made up a new OID in RedHat's space. This site offers a listing of OIDs, but I haven't been able to find where automount belongs.

categories: ldap

Sunday, 17 July 2006 : LDAP: client profiles

Today's travail: trying to set up an ldap client using profiles.

Instead of just killing smelly (the Solaris client test box), this time I tried setting up profiles to make it easier to kill him next time. Now, technically, using profiles isn't "required" but a quick read through ldapclient's manpage and through Sun's docs shows it to be preferred over manual configuration. Naturally, since it isn't technically required, Sun doesn't include the appropriate schemas to use it.

This page has some pretty good documentation on setting up a client on Solaris, and it also includes the schema for nisDomainObject in the LDIF format that SunONE understands. With that schema and the one for DUAConfigProfile mentioned last time (and in need of conversion from OpenLDAP to SunONE) you're armed to begin setting profiles up.

When you run ldapclient init -a profileName=profile-name, it looks for a nisDomainObject with a nisDomain attribute matching the current NIS domain[1]. Then it assumes an ou called profile as a child of that object and searches under there for a DUAConfigProfile with a cn attribute matching the profile you're trying to load[2]. Which means that you need to have that ou=profile underneath any nisDomainObjects you may have.

So now that you know how ldapclient works, this is when we run into another problem. We're using profiles so that we don't need to manually configure every client, but instead can use ldapclient to retrieve the client file from the ldap server for us. When running the command on Solaris 10, however, we run into another glitch: Solaris 10's Service Management Facility. When smf attempts to turn on and off the different services to initialize ldap it fails to start network/ldap/client:default. After looking through many a log file we eventually identified that the problem is:

ldap_cachemgr[5459]: [ID 293258 daemon.error] libsldap: Status: 0 Mesg: Configuration Error: No entry for 'NS_LDAP_BINDDN' found.
ldap_cachemgr[5458]: [ID 703877 daemon.error] ldap_cachemgr: failed (rc = 255).

I.e. there's no client file. Now, the whole point of running ldapclient init ... is to generate such a file! At this point I'm inclined to believe that this is a bug in Sun's code. Absolutely no diagnostics are presented for why the file might not be created, nor does any log file mention anything other than that the file fails to exist. I would be elated if someone could point out a solution to this problem. For further reference, trying ldapclient manual ... also fails with the exact same unreported error.

[1] Or (&(objectClass=nisDomainObject)(nisDomain=current-nis-domain)) if you prefer.

[2] According to the logs it actually looks under ou=profile,... for (&(|(objectClass=SolarisNamingProfile) (objectClass=DUAConfigProfile)) (cn=profile-name)) which seems to imply that a SolarisNamingProfile could be used as well. Since ldapclient genprofile ... creates a DUAConfigProfile however, I'm disinclined to tempt fate.

mood of the moment: disgruntled

categories: ldap

Sunday, 16 July 2006 : LDAP: Introductions, client configuration, schema loading, and autofs

So at work they have me switching over the network from NIS to LDAP for a network of a couple hundred machines on a mixed network of Solaris10 and linux (Ubuntu). Now, the basic idea of LDAP is pretty easy and there are a number of books that'll teach it to you like O'Reilly's take on the subject. Unfortunately once you get past the basic introduction to what it is, the documentation peters out.

Documentation is important. In fact that's the very First Law of Language Design: a language is only as good as its documentation. The documentation for LDAP is like the documentation for an extremely proprietary product intended only for use by governments and very large corporations. First of all there's very little of it. Secondly what documentation there is goes into excruciating and technical detail of very specific and often esoteric facets of the product, but makes no attempt whatsoever to answer basic questions nor mention how one is to actually use it. Thirdly, what brief discussions there are of how to do half of a basic task only discuss a different model (openldap) than the one we're using, which is of course incompatible with ours (SunONE DS5.2).

Back when I started this blog I started it with the hopes of collecting my thoughts and experiences as I navigate the world of IT and F/OSS in particular. While oft I've been defunct at doing so, I think for this project especially it's important to keep a log of my discoveries so that others in my position may learn from my tribulations. And now, onward to the first colossi to be slain.

...

In truth, the first colossus was just getting the thing up and running enough to use the gui console to look at the (now sparse) directory tree and to get the thing to accept input in LDIF format. I didn't keep much documentation of that but it should be relatively straightforward even though you'll doubtless encounter some stumbling block along your way. For importing LDIF the manpage for ldapmodify is a helpful one. This page from Sun's documentation has a little more, but I also suggest reading what the O'Reilly book has to say in order to get an understanding of LDIF format itself. This is a script I wrote for converting passwd files into LDIF.

We'll skip the second and as yet undefeated colossus for now and bound on to the third one mortally wounded just this past friday. This site has the bulk of what little documentation I've seen for setting up clients, this page from Sun has a tiny bit more; all other documentation only looks at the server end of things. There's a complication however. From a basic install the SunONE server doesn't understand the DUAConfigProfile object class that ldapclient generates and requires. Other than that you can just use ldapclient genprofile to output a profile, just save the output and import it; see ldapclient's manpage for more info.

In order to load new object classes first you need to get ahold of the schema for them. The schema for DUAConfigProfile is here (in openldap's schema format). Naturally the SunONE server doesn't understand OpenLDAP's schema files. This script takes in an openldap schema file and prints out a version Sun's server can understand. This page from Sun's documentation gives a brief overview of adding new schema in. The big hangup it overlooks is that the files must have an extension of .ldif. If you save the output from that script with a *.schema suffix, the server will ignore it. Other than those two tricks installing new schema is easy: find the schema you need, convert it to Sun's format with the script, save it to the ServerRoot/slapd-serverID/config/schema directory in the 9* range with an *.ldif suffix, and restart the server.

The second colossus is trying to set up autofs so that NFS partitions can be mounted automatically on need across your network. Last time I tried this I ended up hosing the box; I'll prolly hose another on monday when I try it again. Naturally, again, the automounter on linux is different from the one on Solaris. This page gives an overview on how to set things up for linux.

In short, for those who may be unfamiliar with autofs. There's an automounter file called auto.master which maps the directory containing mountpoints to a different kind of map; that second map maps the individual mountpoints to where they come from over nfs. In ldap you have a single automountMap object with a child automount object for every line of the file. Now we have the parent dirs point to ldap queries, each of those queries points to an ou which represents the second kind of map and has automount children for each line of the old file.

On Solaris the file is called auto_master instead (and linux can be configured to use underscore for compatibility) and instead of automountMap and automount, nisMap and nisObject are used instead, although I've heard (though I've mislaid the link) that as of Solaris10 the openldap/redhat objects should be the default. Naturally the schema for the openldap/redhat objects doesn't ship with SunONE, though they can be found in openldap format here with prerequisites here and here. Note that the core and cosine prereqs have some schema already declared by Sun's schemas. My Sun formatted versions with those duplicates commented out are here and here. Also I have some scripts I used for converting the old files and NIS maps here and here.

music of the moment: Dead Boy's Poem ~ Nightwish (Wishmaster)

categories: ldap

Sunday, 16 July 2006 : Eng: On Sanity

So here's another crazy idea I have for Eng. When writing code, particularly at a low level, there's frequently call for doing basic sanity checks. That is, you often see code like this if ($foo == nil) { omg die! } elsif ($foo->bar()) { blah blah; } scattered liberally around everywhere. And especially with method calls when you got your object from some potentially unreliable source you see if ($foo and $foo->bar()) all over. Wouldn't it be nice if we could abstract that away?

I've had a few different ideas in the general area of different semantics of "return values" — for example you have different senses of "return value" when you're talking about the evaluation of a function/method/operator, the error code for failed evaluation, and returning "this" object so you can chain method calls together — but I recently had an idea that might be easier to implement in a similar field: we could do sanity checking with a special postfix operator ?. You would use it like $foo?->bar() and it would have short-circuit semantics so that if $foo passes the sanity check it behaves like a no-op, but if $foo fails sanity checking then it will skip evaluating bar() because trying to evaluate it will probably explode, what with $foo not being sane and all. In case there are different types of unsanity it would also be helpful to have some special variable ($? perhaps) that stores the type of unsanity, or have perhaps some other way to deal with unsanity handling.

The challenge is where exactly to short-circuit to. If we have a statement that's a chain of method or field calls like $foo->bar()->baz->bif()->zorch()->quux; then it makes sense, wherever the ? is, to skip everything from there to the end of the chain/statement. But in other circumstances things get more complicated.

For instance, what do we do if it's in a conditional statement? If it's just a basic conditional statement like if ($foo?) or if ($foo?->bar) then it would make sense to have the whole statement return false. But if it's in a compound conditional statement like if ($foo?->bar() or $zorch->bazzle()) then it would make sense to skip the call to bar(), fail the first subexpression, and so evaluate the second. We could say that in boolean contexts the expression returns false, but that is contingent on what we mean by "expression" here.

Another example of complexity is how it interacts with non-method calls, such as arithmetic expressions. If we have $foo?->numeric() + 5 and $foo isn't sane, then what should the expression return? Well maybe the whole greater expression should fail (i.e. be skipped), that sounds sensible. Now what happens in non-algebraic expressions, like assignment? Should $foo = $bar?->baz() skip the assignment too, or should it set $foo to a known-bad value like nil? In case that question seems too straight forward, compare it against foo($bar?->baz(), $bif); should we call foo() with a known-bad value, or should we short-circuit out of ever calling foo()? Also, since ? is intended to simplify code, we must expect that callers won't necessarily test $? unless they really care about why ? failed.

A brief digression into implementation details. When calling ? we obviously need to pass it the thing we're testing. But at an assembly level, in order to know where to short-circuit to we also need to pass a label to jump back to. If during the course of evaluating sanity we determine the object's not sane, we can't just do a normal return because that wouldn't give us the short-circuit semantics we desire. The easiest way to get our short-circuiting would be to overwrite the memory location that stores the return address with the label we wish to jump to, and then when we return we'll return to somewhere else. In the event that some architectures disallow overwriting the return address, we'll have to manually pop a stack frame and do an unconditional jump, or use an inline function instead of a real function call, or devise some other method. If we allow operator overloading, overloading ? will have to be treated specially since we're not using a normal return.

Back to semantics. So far, I can identify six contexts that to be specified: method chains, boolean expressions, arithmetic expressions, string expressions, function calls, and assignment. And two general approaches: returning nil/false/zero/lambda or skipping evaluation. Since the whole point of this sanity checking operator is to avoid Doing Bad Things(tm) I'm thinking that we should err on the side of skipping evaluation, but there are certain places where just jumping to the next statement is jumping further than we may like. Inside conditional expressions — particularly disjunctions — is the big one, but also boolean expressions used as shorthand for control flow (like Perl's famous open($fh, ">", $file) or die "omg!";). Perhaps when the ? is buried in a boolean context it will skip the rest of evaluation for that whole subexpression and return false to the boolean operator, but in all other situations it just skips to the next statement. That sounds like a sensible enough place to start for Doing The Right Thing.

music of the moment: Maahinen Neito (Earth-Maiden) ~ Värttinä (Iki)

categories: eng, design

Sunday, 9 July 2006 : Χαιρε Ξενοβιη

So I got my new laptop about a week back now and've been meaning to mention that. I've gotten everything restored from before, though it was in need of organizing when I backed it up. At least nothing was lost. I was calling the new laptop Fuchikoma Mk II for a while, but much as I love the name it just didn't seem appropriate. And so, the newest addition to the household has been dubbed Xenobia. Everyone, say hello to Xenobia.

For the gearheads who haven't been following my other blogs, Xenobia is a 15.4" MacBook Pro with a 2.16GHz dual-core Intel Core Duo processor, 2x512MB 667MHz DDR2 SDRAM, a 100GB 7200rpm SATA/100 HDD, ATI Mobility Radeon X1600 with 256MB GDDR3 SDRAM, plus standard things like gigabit ethernet, 802.11g/b, and bluetooth.

Xenobia has a number of nifty hardware components, many of them the sort of thing I'm amazed took this long for anyone to think of. Apple's usually good for that. There's a camera above the lcd. Kinda cute, but it looks like there's only one program that can recognize its existence, and that one only treats it as a camera, not as a videocam. (Though I hear tale that iChat can recognize it for video conferencing.) But the cool features are that the keyboard is backlit, and that the brightness of the keyboard and the monitor can be set to automatically adjust to the level of ambient lighting. Also there's two-finger scrolling and two-finger click for the right-click (in addition to the good ol' control-click). I think I prefer the old mod that let you do two-finger scrolling with clockwise and anticlockwise gestures instead of the up/down version that's now in the os. Of course this new way allows for sideways scrolling too.

In the near future, possibly even later today, I'll be making a post about some things to sell. Xenobia came with an offer of a free iPod. I have no need of an iPod, and so I hope to sell it off to help finance the new laptop. Also, since nearly everything on Fuchikoma still works fine, I'm planning on selling him out for parts. There are a few other tech gadgets I'm looking to be rid of too, so stay tuned for the posting.

Sunday, 9 July 2006 : Dashboard hacking

So Apple's latest update has some questionable details apparently. It's claimed the messages sent home don't contain anything meaningful, though I haven't heard of anyone specifically verifying this.

The article provides one way of disabling the service, though I advise against using it. Modern OSes aren't meant to be hacked that way, a better way of disabling it is mentioned in the comments, namely: sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.dashboard.advisory.fetch.plist . Not only is this approach cleaner since it uses Apple's own tools, it brings up those tools which — while mostly undocumented — are very helpful for finding out how your system works to spot other potential concerns.

I noticed however that I couldn't find the Dashboard Advisory running on my system. This is probably because I've done another little hack to disable the Dashboard entirely. I can't seem to find the original article on doing this though this one explains in brief how and why you'd want to do it. The command for the curious is: defaults write com.apple.dashboard mcx-disabled -boolean YES && killall Dock .

These are just a couple of the little hacks I've made to my system. I thought any mac users out there might also be interested in them. And yes, both launchctl and defaults have manpages you can look through.

categories: os x

Sunday, 14 May 2006 : Resurrecting wisemen

So I finally got ol' man Bal reloaded and running again. After all these years he's up to a 2.6.15 kernel, Gentoo natch. Right now he's not configured to do very much, but there are a few things.

I have appletalk set up so I can mount filesystems to osx. Samba should be set up to do the same for windows, but I don't have a windows box to check if it works on. I also have daapd[1] and howl[2] set up for iTunes sharing, which is nice to have again. Other than that it's just the usual suspects: ssh, elinks[3], rsync, cvs, svn, screen, irssi[4], mutt,...

Other than what's currently set up, I'm not sure what all I want Bal to be used for. I need to set up some vpn stuff as part of a larger project to make the itunes server available in the 'combs at psu, but that's not going to happen for a long time yet. It'd be nice to set up some way to access him from the outside world so I'll prolly mess with that, and I'll need to get a good firewall set up when I do. Any other suggestions?

In other news it looks like I might have a job lined up for the summer doing tech stuff, and potentially another one as a research assistant. Which takes a load of from the financial worries of recent. Other than that I'm looking at spending some quality time on my projects this summer. Paperboy RSS is still in an interstitial place between v1.0 and v2.0. Once Paperboy 2.0 is released I'd like to get some work on Titania done, setting up the XS and such to link to the library. I'm also going to start working on writing the complier for Eng, try to see how much of that I can get done to see what the interesting parts will be to work on. Another small project that came to mind is it'd be nice to set up a script so I can just drop urls into a folder somewhere and let it harvest them to do all those link roundups I'm prone to. Of course before I do that I'll need to break the link roundups off into their own blog. And of course once I get Titania set up better I'll need to convert the site over from the current framework. It'd also be nice to spend some time adding features to the dtds and xslts I use, I've come up with a pretty sizable list of ones to add.

Well, that's the quick update. It's been too long since I've posted in this blog, hopefully that'll change soon. I have a few essays on Eng in the works, it's just a matter of finding the time to get them out of my head and into electrons mainly.

[1] Though the original daapd has now been obviated by a multithreaded version. And it looks like they've added in playlist support since last time I checked. And made it into the Portage tree, though the "stable" version is too old to work with OSX10.4 so I had to go with the masked version.

[2] Though it looks like Howl is no longer being developed by anyone. Hopefully someone'll step in to fill the gap and keep the linux itunes server alive.

[3] And have I mentioned how bad ass elinks is recently? You want this browser. Now if only I could get it to compile on osx. ::sigh::

[4] Which reminds me, I've done a lot of mods and such for irssi; I should put them online somewhere.

music of the moment: Golden ~ My Morning Jacket (It Still Moves)

Sunday, 3 April 2006 : Eng: On tensors: are lists and axes the same?

There's a conceptual problem that's been plaguing me with Eng. For a first writing on the language, this isn't the area I would normally choose. But the most logical starting points — e.g. lexical aliasing, code metastructure — I've a pretty good handle on. Those are topics I will delve into in the future, but for now, this is my problem.

One of the design considerations for Eng is to be able to provide crossplatform natively parallel code.[1] When I say this, one should bear in mind that this goal alone is as large or larger than all the other goals for the language combined. There's a specific facet to a specific part of my attempt to do this which is causing me troubles, but before that, some history.

Back in the 1980s parallelism was very much in vogue. Then, come the late-1980s/early-1990s when microprocessor speeds started bounding through the roof, everyone decided parallelism was stupid and abandoned most research on it. (Though of course, there are always a few diehards for any technology.) These days, as we're starting to reach the limits of current fabrication technologies, and with the shape of the curve between speed and cost and how it's evolved over time, the field of parallelism is starting to simmer again with a passion fit to boil over the coming decade.

To see this, just take a look at the current market: basic home computers are shipping with 2~4 processors, many of those are dual- or (will soon be) quad-core processors, or have hyperthreading. And if that home market doesn't convince you, then take a look at console game platforms like the xbox360 or the ps3 (or their previous counterparts). And this doesn't even get into the server market which is growing ever faster as computers become persistantly more ubiquitous tools for every endeavor and every business. Nor does it get into distributed systems like the microchip in your keyboard and mouse. Let alone the scientific markets for weather modelling, VLSI, and AI — i.e. those diehards who never left.

...

Now that you're convinced that a natively-parallel language is essential, the first part of my problem. There are two basic varieties of data in the universe: scalar data, and aggregate data; though the lines between them may not always be clear (cf. records or objects). We can clear that line up a bit if we discount records, or call them scalars, and only consider serialized aggregate data (e.g. arrays, matricies, etc). For this discussion we'll ignore the differences between different kinds of scalars, and just consider them unary widgets which can be put into serialized aggregate forms.

For reasons immediately apparent to any who've worked on SIMD systems, and perhaps even to those who've worked sufficiently on highly data-parallel programs on conventional SISD (aka non-parallel) systems, the normal presentation of arrays and other serialized aggregations is woefully simplistic and insufficient for dealing with massively data-parallel programs, and even for many other programs. Furthermore, that some languages have separate datatypes/syntaxes for first-order aggregations (arrays, vectors, etc) and second-order ones (matricies, 2D arrays, etc) is a kludge at best.

Luckily, there's a readily conceivable solution, though I don't know that anyone else has come up with it. The reasons built-in matrices, or the equally infamous arrays-of-arrays, and their ilk seem so kludgey is because as we look at them we can see that they're just an extended case of our basic dependable array. So we ask ourselves, why then do we treat them so differently? The ready solution is: we don't.

I've already started using some terminology which does this. In mathematics there's a concept called a tensor which is a generalization and abstraction of matrices and the like. A tensor is a multi-dimensional "thing" where the minimum number of dimensions needed to describe it is it's order (or rank). So when we're talking about arrays or vectors, we can also describe them as first-order tensors. And matrices are just second-order tensors. We can of course extend this upwards for 3rd-order tensors (cubes), 4th-order tensors (hypercubes), all the way up to infinity — though it becomes hard to visualize after four orders or so. We can also extend this downwards: a scalar, is simply a zeroth-order tensor.

In Eng, I propose that the built-in serialized datastructure is not an array nor a matrix, but rather that it is a tensor in all its abstraction. Which of course means, that we have no scalars, since scalars are just one extremum of a tensor. Now, maybe we don't want to call them tensors — maybe another name which doesn't bring to mind the horrors of linear algebra would be desirable — but the idea is the same, whatever you call it. For a fuller argument on the complexities of having tensors as your basic data type, what the benefits are, and why I've been motivated to head that direction, I will write another essay at some point.

Some more terminology is helpful for talking about tensors. The order of a tensor can also be described as the count of its axes. And every axis has a length or dimension describing how many elements it has along that axis.[2] The shape of a tensor is a full description of it's axes and each of their dimensions; this is helpful because we can have two kth-order tensors which still have a different shape (i.e. arrays can be of different lengths, matrices can be square or long or wide, etc). When performing operations on tensors their shapes must be compatible, not necessarily the same shape but shapes which "fit" together for the operation in question. For an example of different but compatible shapes, consider matrix multiplication which takes an AxB input and a BxA input.

...

Another topic I'll need to justify the importance of in a separate essay is, there's an irony in making a natively-parallel mid-level language. The idea behind MLLs is to minimize the abstraction in syntax (when compared to HLLs; they provide plenty of abstraction when compared to LLLs). And yet, say we have an array of elements that we want to perform some function on. With a conventional MLL we'd write a for-loop and iterate over all the elements. The problem is for parallel programs, the compiler has no way to look at that loop and determine that those operations could be done in parallel. And so, for parallelism, it's helpful to abstract the parallelism into the syntax.

In APL there's the notion of "verbs" and "adverbs"[3]. Verbs include functions, methods, and operators — they're things you do, operations to perform. Adverbs instead are things which affect how you perform those operations. I like the notion of adverbs because they allow for some very sophisticated code (cf. APL's generalized dot-product), and so I'd like to have them in some form or another.[4]

For an example, in most languages addition is a binary infix operator, so it takes arguments like a + b. If we want to get the summation of a list, we could use more than one addition as so: a + b +...+ z. Unfortunately this isn't always possible and seems over-redundant. In a language with adverbs we could use something like a "list-valence adverb" to modify the addition operator so that it takes a list of arguments instead of just two. So we could change the example to something like: +\ a, b, ..., z.[5]

Another nice thing that this sort of list-valence adverb provides is an abstraction for parallelism. It's easy for the compiler to look at one operation taking a large number of arguments and to decide that this should be done in parallel. With many individual calls where associativity and precedence and the like come into play, it's much harder.

...

An aside so the quandry presented later on will make sense. Now that we have these tensors, we need some way to extract elements from them to perform actions on. We'd have to be able to pull out individual elements, as a trivial requirement. But it'd also be nice to be able to pull out "slices" of data as well. For example, say we have a 3x3x3 cube of elements (that is, a third-order tensor with all axes having a dimension of 3). Perhaps this represents the contents of a room. Now, say we want to find out something about the room based on height, e.g. to discover that there are more objects in the lower portion of the room than the upper. It'd be helpful to be able to slice that cube up by its height into a number of "stacked plates"[6] so that we can find the number of objects in each "plate" to compare them. Naturally, each "plate" can be manipulated in parallel so it'd be nice to return a list of them to operate on at once, rather than needing a loop to iterate over them. Additionally, it'd be nice not to need to specify anything about any of the other dimensions than the one we're slicing along, perhaps trivial for our cube, but what if we were working with a 256th-order tensor?

Here's some working syntax I've been using, though like any working syntax the exact symbols could be changed to something better later on. So first we have the tensor in it's entirety, which could be referred to with A. To specify things about the dimensions of the tensor we use a suffix like [pos:axis] where pos is the position along the axis, and axis specifies which axis we're referring to. For example, we could get a corner element from A by typing A[0:0][0:1][0:2]. Since most tensors will be of low order and their lowest axes will be accessed the most frequently, the :axis portion can be optional with the implicit dimension being 0 for the first one, then 1 for the second, 2 for the third, etc; thus simplifying that corner element to A[0][0][0].[7]

So far so good. Now, what about our example of wanting to get a stack of plates from a cube (that is, a list of second-order tensors corresponding to the third-order tensor). Well, what if we use the normal syntax for specifying details about an axis, but leave off the position, as in A[:0]? That seems intuitive enough. Now, if we want just one of those slices instead of all of them we could use A[0] (or A[0:0]). Note the difference between this and the fully-qualified A[0][0][0]; the first returns a 2nd-order tensor, whereas the latter returns a 0th-order tensor; we could also do A[0][0] to return a 1st-order tensor. And if we wanted more than one slice but not all of them, we could use A[0,1] (or A[0,1:0]) to return the first and the second (but discard the third).

For convenience of programming, we can also take a note from Perl. In Perl, since the length of arrays is known, we can use negative indices to count from the end of a dimension of a tensor. So -1 would be the last element, -2 the next to last, etc. For statically allocated tensors this ability is trivial to add, but even for dynamically allocated ones it should not be too difficult to implement.

...

Enter my quandry. So we have this list-valence adverb, \, which changes our binary infix operator for addition, + and turns it into an n-ary prefix operator for the sum of a list, +\.[8] And we have this notion of slices which returns a list of lower order tensors. So, if we had an array of numbers and wanted their sum, we could do something like +\ A[:0]. Or if we had a 2nd-order tensor and wanted to get a 1st-order tensor with the sums of the columns, we could do +\ A[:1] with the addition operator performing "straight-through" addition. (More on the issue of dot-products and tensor-products in a later essay.)

Now, at a first glance, a list isn't anything different than an array, and so shouldn't be any different than an axis in a tensor. Conceptually that's a very elegant idea, and in my minimalistic mindset I like it. However, there are some problems it raises. We introduced the list-valence adverb to simplify getting the sum (or product, or...) of a number of elements, both to improve parallelism and optimization, but also for easing clarity and code maintenance. So what if we wanted to do that column sum for a large number of tensors? We can't just do +\ A[:1], B[:1],... since that would give us the sum of the whole list (that is, lists are flat). We could use a sequence of +\ A[:1], +\ B[:1], etc or put it in a loop, but that's disgustingly kludgey. Maybe we could introduce a new slicewise adverb so we could do +:1 A, B,... to return a list of the 1st-order tensors.

But if a list is the same thing as a tensor's dimension, then the list-valence adverb is really just turns a binary operation which takes kth-order tensors into a unary operator which takes a single (k+1)th-order tensor with the new axis having a dimension equal to the length of the list. If that's the case, then the list-valence adverb is nearly identical to this new adverb. But that also throws off which number we use to refer to which axis, and what number do we use for the list axis? Zero and positive numbers are taken for indices, and the negatives are used for the convenient count-from-the-end indices so we can't use them either.

They both look like reduction adverbs. List-valence takes a list of kth-order tensors (or a (k+1)th-order tensor), and returns a single kth-order tensor. This new slicewise adverb takes a list of kth-order tensors (or one (k+1)th-order), and returns a list of (k-1)th-order tensors (or a kth-order tensor adding a new axis with dimenstion equal to the length of the input list, and removing the axis specified[9]). Whether we consider a list the same as an axis or not, there's a subtle distinction there, and I'm not sure whether it's relevant or not.

Of course, if we consider a list and an axis the same, then there's also potential for confusion with the list-valence adverb. What does +\ A mean? Is it the straight-through summation of a list of tensors (of which there happens to only be one and so the output is A), or does it slice A by it's first axis? This confusion is largely predicated on whether we consider the commata delimiting a list to be a constructor for this new larger tensor or not.

And then there's always the issue that in a tensor, the dimension of an axis is the same regardless of the position on other axes, i.e. any way we slice a tensor we'll get a list of lower order tensors each of which has the same shape (though shapes may differ between different lists). While this is handy for many purposes it's inconvenient for others, such as times when we may need a list of tensors with different shapes.

I suppose, in the end, it's probably not an issue where there's a "right" and a "wrong" answer. But rather I'll just have to pick one or the other — either a list is an axis, or it is not — and stick to it. Having talked through the complications raised by the question, namely the last one about lists of different shaped tensors, I'm thinking it's probably best to treat them as separate. But just as we have the slicing mechanism to convert an axis to a list, we should have some construction mechanism for turning a list into an axis.

[1] Specifically for "typical" home machines and servers rather than scientific machines, but good solutions to the former will doubtless lead to good solutions for the latter, even if they are not the same, as scientific computers are often a "simpler" form of parallelism than home machines, though there's much more of it.

[2] Of course it's possible to think of continuous tensors rather than discrete ones, in which case the axis' dimension would be a description of the range of values that exist along that axis. And we can have non-numerical or unordered axes as well, axes whose positions are names rather than integers.

[3] That's J terminology. In APL they're called "functions" and "operators" respectively, an unfortunate choice of words since it confuses those familiar with the conventional definitions, and doesn't distinguish user-defined functions like foo() and built-in 'functions' like +. Because of this confusion, J which is a modern successor to APL changed the jargon, and I'll use theirs to minimize the confusion.

[4] Aficionados of functional programming languages may point out that these sound simply like higher-order functions. And, I suppose, in a sense they are, and that may be a good way to implement the functionality. However I think that just as it's helpful to distinguish user-defined functions and operators, it's helpful to distinguish user-defined higher-order functions and adverbs.

[5] We could also just change the addition operator to be a prefix operator which takes a list, as it is in Lisp. However, there's something about the infix form which seems more intuitive and natural, and so it'd be nice to keep it around if possible.

[6] That is — depending on how we have the tensor organized — slicing along the first axis. We could also slice along the other axes as well. Since the original tensor was a cube, slicing along any axis will create a list of the same length comprised of 2nd-order tensors of the same shape (square). However, as we visualize this, it's apparent that there's a subtle difference between the items in these lists — namely, their orientation. Capturing this will require that we conceptualize all kth-order tensors as existing in an infinite-order universe within which they can be rotated. Yet again, the details of this are fodder for another essay.

[7] This simple explanation could also be extrapolated. So say we have that 256th-order tensor and we want to specify the 70th through 80th axes. We could use something like A[0:70][0]...[0], where the lack of :axis means use the previous axis plus one, or use the 0th axis if there are no previously specified axes.

[8] We could do similar things with -\, *\, or even some user-defined function foo\.

[9] Or possibly it returns a list of kth-order tensors or a single (k+1)th-order tensor, with the specified axis having a dimension of 1, to preserve sanity of axis numbering.

music of the moment: All the Way Down ~ Voltaire (The Devil's Bris)

categories: eng, design

Sunday, 29 December 2005 : Z+WiFi, mk. III

So the Zonet ZCF1100 card finally arrived and my woes continue :) It's not too bad though. This card is known to work (a) on the hardware and (b) under linux. The problem is more that it requires the hostap driver and related software which, while available with the OpenZaurus and Cacko ROMs, isn't available under the standard Sharp ROM, which I'm using. Which means I'll have to compile it myself. Again, not too much of a problem, but it looks like some vital(?) header files are missing for making the kernel module. Which in turn means that I'll have to do some research and hack it out. Oh, and it looks like the striped down version of make I have installed on elys stripped down a bit too far and chokes on part of the makefile (and oh what a twisted makefile it is!). So I'll get to play around with my crosscompiler while I'm at it. Sigh. I get my self into these things don't I?

Some other helpful links: Mini-howto on Flashing Intersil Prism Chipsets, Wireless LAN resources for Linux, OESF forum discussion.

categories: zaurus

Sunday, 29 December 2005 : Strange bugs

So I found an interesting bug in the standard software that ships with Sharp's 1.01jp ROM. Namely su doesn't reset the $USER variable appropriately. It's worth noting that su - does properly reset it however. As does, oh, su wren. I'm not sure who to report this to. It looks like su is part of the TinyLogin multibinary (v1.4 2004.04.07-09:08+0000) which is akin to BusyBox, so that may give a clue who to contact. Unfortunately their bug tracker is down and the mailing list auto-bounced me. May have to sign up for the list afterall... Of course, I should install the latest Sharp ROM to see if it's already been fixed first. Of course that'll take an SD or MMC card which means it may be a while yet...

categories: zaurus