27 November 25

Epistemological Debt

There is a concept in software engineering called technical debt. It accrues when you take shortcuts in building a software system: you solve the problem immediately at hand, but in doing so you neglect to think through edge cases, and these come back to haunt you as use of the system expands and additional pieces get built out.

I think something analogous, though more sinister, occurs when working with the large language models (e.g. ChatGPT and its rivals) that are the core of the AI boom. I’m calling this “epistemological debt”. LLMs are by design extremely good at returning plausible-sounding text: outputs that look correct at first glance but often contain inaccuracies. For example, AI-generated transcripts and summaries of meetings are now commonplace: this is a standard function in Zoom these days. But what happens when the summaries get saved and become the official record of the meeting without anybody checking the generated text for inaccuracies? Given that people are only getting busier, one suspects these failures to review happen all the time. The inaccuracies accumulate, and nobody can figure out what is truth and what is not.

Posted by at 07:46 PM in Technology | Link |

25 November 25

On Painting and Thought

A water-soluble crayon painting of a Sugar Bee apple, red with a streaked and spotted yellow underlayer. I’m continuing to explore sketching with my new Neocolor II crayons, and here is a painting of one of the Sugar Bee apples from today’s grocery run. I’m starting to learn how the Neocolors work as their own distinct medium. They go on the paper very smoothly — it’s a wax crayon — and it’s easy to spread the pigments around with a wet paintbrush. Once the paper is dry again, you can draw on it with more crayon in another layer. I also picked up a trick from a video about drawing birds with Neocolor IIs. The artist in this video uses a plastic palette with a rough surface. After drawing on the rough surface with a crayon, one can pick up the pigment directly with a wet paintbrush, thus turning the crayon into what is effectively watercolor paint. This gets around some limitations of using the crayons directly on the paper, making it possible to lay down a smooth wash or to paint details with a fine brush. I made an instant rough palette surface by roughing up a yogurt container lid with steel wool, and tested this approach out.

It is interesting that learning how materials behave — in this case a new art medium — is, as far as I can intuit, the domain of non-linguistic thought. When I wanted to add yellow spotting on top of the red of the apple, I just knew that my little rounded flat travel brush would be a good tool for the job. I don’t believe language had anything to do with this thought.

This is a consequential realization, because the trillions being invested in AI right now rest, for the most part, on the manipulation of language. To the best of my knowledge the heart of today’s AI boom is large language models (LLMs). There was a piece published in The Verge today about how this is likely a philosophical error. The article is entitled “Large language mistake” and has the subheading “Cutting-edge research shows language is not the same as intelligence. The entire AI bubble is built on ignoring it.” The article draws upon a perspective piece published in Nature last year entitled “Language is primarily a tool for communication rather than thought”, arguing its case from contemporary neuroscience and linguistics. I’m not expecting AIs to know how to paint watercolor anytime soon.

Posted by at 07:43 PM in Design Arts | Link |

13 November 25

Transcribing Catalan With My New Workstation

Thanks to the Easy Languages folks, I learned the power of target-language subtitling of video content in language learning, and this has been a big part of my Catalan studies. The Easy Languages approach is to do double subtitling: for Catalan, this means subtitles in both Catalan and English. But it is also very helpful to watch videos singly subtitled in the target language (e.g. Catalan subtitles for Catalan video), and I have been watching these where I can find them. The YouTube channel Català al Natural does this specifically for language learning, and as I’ve described earlier I have watched many episodes of the TV series El Foraster this way.

But most of the Catalan content on YouTube has no subtitling available, which limits its utility to a beginner in the language. What to do? I came up with a plan for adding automated subtitling to the video content, and tried this out yesterday with much success. The workflow is as follows: a) download the YouTube video to my workstation b) run speech-to-text software over the audio channel of the downloaded video and c) add the transcribed text as subtitles as one watches the video stored locally.

This approach came together very easily using my new workstation. The details are as follows. First, I used the program yt-dlp to download the video from YouTube. The next step is the speech-to-text conversion. I used Whisper here, which I believe is the best open source speech-to-text converter; at least, that is what I gathered from working with the AI institute a year-and-a-half ago. This is software from the belly of the AI beast, coming from the company OpenAI. It is multilingual, and Catalan is one of the better-performing languages in the software. The output from this program consists of transcribed text with timestamps. Finally, I watched the video in the program Celluloid, which turns out to be smart enough to take the text-with-timestamps and overlay the text on the video as subtitles at the right times.
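The glue between these steps is minimal. Whisper’s own tooling can emit subtitle files directly, but as a sketch of what the text-with-timestamps step produces, here is a small Python fragment that turns Whisper-style segments (dictionaries with start, end, and text fields, per the openai-whisper documentation) into the SRT subtitle format that Celluloid reads. The commented-out lines show what I understand the transcription call to look like; the model size and file names there are placeholders, not a recommendation.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 83.5 -> '00:01:23,500'."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    """Turn Whisper-style segments (dicts with start/end/text) into SRT text."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(seg['start'])} --> "
            f"{srt_timestamp(seg['end'])}\n{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)

# The transcription step itself (commented out so the formatter above
# runs standalone; model size and file names are placeholders):
# import whisper
# model = whisper.load_model("medium")
# result = model.transcribe("video.webm", language="ca")
# open("video.srt", "w").write(segments_to_srt(result["segments"]))
```

Dropping the resulting `.srt` file next to the video with a matching name is enough for most players, Celluloid included, to pick it up automatically.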

It greatly helps the accuracy of the transcription not to have to do it in real time, as the software can take advantage of the language context around the current timepoint to produce a better transcription. My new workstation is very helpful here, having a graphics card with 12 GB of VRAM. It still takes a while: transcription ran at about 4x real speed (that is, a 12-minute video took about 3 minutes to transcribe). The output seems very good, though as a beginner in the language I am not the best one to judge.

I tested this system today with a couple of recent videos from VilaWeb, and was pleased with how it helped. I might try experimenting with double subtitling a la Easy Languages, since I think that is supported by the video playback software after some fiddling.

Posted by at 06:58 PM in Books and Language | Link |

11 November 25

Patterns of Liberation

About four years ago I was doing some literature research on information and communications technology for sustainable development and came across the writings of Douglas Schuler, a computer scientist, now retired from Evergreen State College in Washington, who works on democratic technology. He is most noted for the 2008 book Liberating Voices: A Pattern Language for Communication Revolution, published by MIT Press. I revisited this book today, and it seems a good work to share in our present moment. As the title suggests, it is inspired by the highly influential 1977 book A Pattern Language: Towns, Buildings, Construction by architect Christopher Alexander. Liberating Voices takes a similar approach to Alexander’s book, providing a catalog of patterns helpful for positive social change.

The physical book for Liberating Voices seems hard to find, but much of the content is replicated on the website of the Public Sphere Project. In particular, there is a section specifically on the Liberating Voices pattern language. Examples of these patterns include Linguistic Diversity, Participatory Design, Intermediate Technology, and Voices of the Unheard. There are 136 patterns listed in the original Liberating Voices publication, and these are summarized in a set of cards here. Patterns that others have submitted are also listed here.

It looks like the Public Sphere Project has gone dormant for now but many of the patterns described there for social change are timeless, and it is well worth reviewing the set for ideas on how to act.

Posted by at 05:25 PM in Politics | Link |

8 September 25

The Jujube AI

A photo showing a 3x3 chessboard, a set of 24 small boxes with move diagrams labeled on top, and a blue plastic case with some colored beads in it. When I was a child, my mother and I crafted an AI out of a set of matchboxes and some jujubes (the small colored gummy candies). I enjoyed reading Scientific American when I was young, and at one point found an intriguing article by Martin Gardner, the columnist who wrote Mathematical Games for 25 years. This article was entitled A Matchbox Game-Learning Machine and was published in 1962 — I probably ran across it in a reprinted book collection.

Gardner’s article shows how to build an analog learning machine to play a simple game called hexapawn, which involves moving pawns on a 3×3 chessboard. The rules are given in the article linked above. At right is a photo from a much more recent Instructables article showing the setup. In the game the human player moves first as white, and the machine plays the black side. The system works as follows. On top of the boxes are diagrams illustrating all the possible states of the game after moves 2, 4, and 6 (the game can last no more than 7 moves), with the possible moves for black drawn as arrows in different colors. Inside each box are colored beads (or jujubes in my case) corresponding to the colored arrows on top. After the human moves, they find the box corresponding to the state of the game, randomly draw a bead from it, and have black carry out the move indicated by the arrow of that color. The bead is set aside, and if black loses the game, the bead representing its final move is discarded from the machine. That way the machine learns that its final move was an incorrect one to take.

It turns out that given optimal play, black is guaranteed to win, and it doesn’t take very many rounds for the machine to become invincible — somewhere around 30 or 40 games played.

Does this qualify as AI? Absolutely. This demonstrates that machine learning doesn’t require digital computers. Admittedly, this approach doesn’t scale very well: Gardner’s hexapawn example with 24 matchboxes was based on an earlier system for tic-tac-toe that needs over 300 matchboxes.

I am interested in other examples of analog AI. In particular, contemporary board games often have subsystems for solitaire play that can be quite hard to beat. These generally do not learn from experience, but they do respond to current states of the game by setting goals and carrying out actions.

Posted by at 01:00 PM in Technology | Link |

29 August 25

Technically Sweet

When you see something that is technically sweet, you go ahead and do it and you argue about what to do about it only after you have had your technical success. That is the way it was with the atomic bomb.
J. Robert Oppenheimer

I am not the first to link this Oppenheimer quote to recent developments in AI, but it seems quite apt. It is striking how quickly this era of generative AI has come about. The landmark paper presenting the theoretical architecture (Attention Is All You Need) behind large language models (e.g. ChatGPT and its relatives) was published in 2017. ChatGPT itself was released in November 2022, scaling up in complexity from the prototype model presented in the Attention paper by a factor of about 800.

The arrival of generative AI for images and video is another case of rapid evolution, well presented in a Stephen Welch YouTube video on the theory behind these technologies. Today there are numerous systems for generating video from text descriptions, but it took several mathematical breakthroughs in the past five years to get to these. For instance in February 2021 research was published describing a training method for placing images and their text descriptions in the same high-dimensional numerical space, but that was just the initial step in image generation, let alone video creation.

But the model built for the 2021 research was trained on 400 million pairs of images with corresponding text, scraped from we don’t know where. This week one of the big AI companies, Anthropic, settled out of court a major copyright class action lawsuit concerning the company’s use of millions of pirated books. Also this week, a wrongful death lawsuit was filed against the company OpenAI detailing how ChatGPT coached a teenager in committing suicide. Meanwhile, it has become clear that large language model-driven systems have security flaws that one can drive proverbial trucks through. And it has become incredibly easy to use text-to-image AI systems to create fake photographs for propaganda purposes. Pursuing the technically sweet has gotten well ahead of ethics. Again.

Posted by at 04:44 PM in Technology | Link |

21 August 25

The Domestic LLM

One of the reasons I built my own computer recently was to have a machine available considerably more powerful than my laptop so that I can learn about and experiment with current technology. I am now playing around with large language models (aka LLMs), the key technology behind ChatGPT and its rivals. As is widely recognized, these state-of-the-art systems consume enormous amounts of resources to build and keep running. What’s less widely known is that smaller versions of these same models are continually being released as freely available downloads for community experimentation, research, and development. Many of these open models are still far too large to run at home, but many others will run happily on ordinary home computers (though the more powerful your graphics card is, the better off you are). I’ve been learning a great deal experimenting with these. Some of the things I’ve learned are:

  • There is an enormous amount of development going on across this whole space. There are tools and approaches available now that would have been really useful to me professionally a year-and-a-half ago.
  • Nobody really understands how this technology works. An example: I asked my local LLM to write a poem in iambic tetrameter about the cat sleeping on his cat bed in my office. After some nudging (the first version was in iambic pentameter, but I told it to try again), it succeeded in producing some doggerel with the correct scansion. How did the system do this? We have no idea. We cannot point to a “metrical poetry” module within the system — rather, we are seeing emergent behavior.
  • It is straightforward to set up an LLM system (even one at home) to let you conversationally ask questions and get natural language responses about a body of documents. (My test corpus has been a set of 80 or so conservation management plans from California.) What is not at all straightforward is getting responses that are reliably accurate. Enormous amounts of engineering effort across every domain are being expended right now to build reliable conversational systems, but for now this is both very challenging and expensive.
  • It is not clear what the important real-world applications of local LLM systems (i.e. ones you can run on a laptop or desktop) are going to be. They offer significant privacy benefits, since you can avoid shipping your sensitive documents off to Meta/Google/OpenAI etc. for LLM-based analyses, but will local systems be powerful enough to conduct the analyses? One application of much interest to me is extracting structured information from unstructured or semi-structured text documents. This is a challenge I’ve been pondering for quite a number of years, and LLM-based techniques for doing this are just starting to emerge.
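One pattern I have been trying for that last structured-extraction problem is to ask the model for JSON and then validate everything that comes back, since the model’s reply is the unreliable part. Here is a sketch in Python; the field names, prompt wording, and the commented-out ollama call are all illustrative assumptions of mine, not a canonical recipe.

```python
import json

# Illustrative schema for this sketch, not drawn from any real plan document
FIELDS = {"plan_name", "county", "year"}

def extraction_prompt(document: str) -> str:
    """Prompt asking a local LLM to pull fixed fields out of free text."""
    return (
        "Extract the following fields from the document and reply with "
        f"JSON only, using exactly these keys: {sorted(FIELDS)}.\n\n"
        f"Document:\n{document}"
    )

def parse_reply(reply: str) -> dict:
    """Validate a model reply: strip code fences, parse JSON, check keys.

    Raises ValueError rather than trusting the model; with local models
    especially, malformed or incomplete JSON is a routine failure mode.
    """
    text = reply.strip()
    if text.startswith("```"):
        # drop a ```json ... ``` fence if the model wrapped its answer in one
        text = text.split("```")[1]
        if text.startswith("json"):
            text = text[4:]
    data = json.loads(text)
    if set(data) != FIELDS:
        raise ValueError(f"unexpected keys: {sorted(data)}")
    return data

# The model call itself might look like this with the ollama package
# (model name here is a placeholder):
# import ollama
# reply = ollama.chat(model="llama3.1:8b",
#                     messages=[{"role": "user",
#                                "content": extraction_prompt(doc)}])
# record = parse_reply(reply["message"]["content"])
```

The design choice worth noting is that all the trust lives in `parse_reply`: a run over 80 documents can then log the ValueErrors and retry rather than silently accumulating bad records.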
Posted by at 12:15 PM in Technology | Link |

11 August 25

An AI Lesson From Urban Forest Mapping

Today I fielded an email from a staffer at the California Air Resources Board about the following topic, and I think there’s a general lesson to be had here. Between 2021 and 2023 I worked on a project that was looking at the extent of and ecosystem services provided by the urban forests of California. This was a follow-on to an earlier project our lab had done in 2015 about the same topic, and one of the goals of the project was to do a change analysis between the two time periods. For the question about urban forest canopy extent, we were working with high-resolution tree canopy cover datasets from a company called EarthDefine. In particular, we were comparing a canopy cover data layer from 2012 (used in our 2015 analysis) to a canopy cover data layer from 2018. In theory, all one has to do to measure change in canopy cover extent is to subtract the 2012 layer from the 2018 layer. Pixels with canopy cover in 2012 but not in 2018, or vice versa, would represent change.

In practice, we soon discovered this wasn’t going to work at all. These canopy cover datasets were developed using machine learning models applied over NAIP imagery, which is high-resolution aerial photography produced periodically in a program run by the US Department of Agriculture. When we compared the canopy cover maps in 2012 and 2018 with their source imagery, it was evident that the machine learning models for the two canopy cover datasets used very different ideas about how to recognize and delineate trees in the source imagery. This resulted in unrealistic change statistics; for example, the urban canopy cover in Riverside County purportedly increased by 20% from 2012 to 2018. Basically, the comparison was between outputs from different machine learning models applied over different datasets (in particular, the 2012 imagery had a resolution of 1 meter and the 2018 imagery a resolution of 0.6 meters) — apples and oranges.
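The apples-and-oranges problem can be made concrete with a toy simulation (the “imagery” and both “models” below are fabricated thresholds of my own, nothing to do with the EarthDefine products): two classifiers that disagree only about where to draw the tree/non-tree line will report substantial canopy change on a landscape where nothing changed at all.

```python
import random

random.seed(42)

# Fake "greenness" values for the same 100 x 100 landscape in both years;
# by construction there is zero real change on the ground
scene = [random.random() for _ in range(100 * 100)]

# Two "models" that merely draw the tree/non-tree line differently,
# standing in for the different machine learning models behind the datasets
canopy_2012 = [g > 0.55 for g in scene]   # model A: conservative delineation
canopy_2018 = [g > 0.45 for g in scene]   # model B: more liberal delineation

# Naive change analysis: count pixels that flipped between the two maps
gained = sum(b and not a for a, b in zip(canopy_2012, canopy_2018))
lost = sum(a and not b for a, b in zip(canopy_2012, canopy_2018))

print(f"apparent gain: {gained / len(scene):.0%}, "
      f"apparent loss: {lost / len(scene):.0%}")
```

With these thresholds roughly a tenth of the landscape shows up as “new canopy” despite identical ground truth, which is the same flavor of artifact as the purported 20% Riverside County increase.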

The general lesson for AI is to be very careful about extending an AI model beyond the domain over which it has been trained. Sometimes this works, but many times it does not, with deleterious consequences. In particular, this is one of the antipatterns that can result in AI bias.

Posted by at 04:35 PM in Technology | Link |

7 August 25

The Broken Arrow Up North

This is a follow-up to my last post about the B-29 crash near Fairfield in 1950. Today falls within those several days in August when we think about Hiroshima and Nagasaki, so here’s a tale from the Cold War about the other Broken Arrow incident to happen nearby.

On 14 March 1961 a B-52F bomber carrying two Mark 39 Mod 2 thermonuclear weapons, each with a yield of 3.8 megatons, crashed in Sutter County about 15 miles west of Yuba City and 40 miles north of our location in Davis. The plane had taken off 22 hours previously from Mather Air Force Base just east of Sacramento. (This base closed in 1993, and the field is now a general aviation airport.) The plane was flying an Operation Chrome Dome mission, doing circuits over the Aleutian Islands. Chrome Dome was a Cold War operation, running from 1961 to 1968, in which bombers armed with thermonuclear weapons flew on alert, positioned to attack targets in the Soviet Union if they got the call, with some portion of the nuclear bomber force airborne 24 hours a day.

The mission of this particular B-52F did not go well from the start. About twenty minutes into the flight, very hot air started bleeding into the cockpit, and the crew was unable to sort out the problem. Temperatures within the cabin rose to between 125 and 160 °F, and the crew took turns going to the deck below the cockpit to escape the heat. In contact with the Mather command post, the crew received instructions to continue with the mission as long as possible. Fourteen hours into the flight the pilot’s window cracked from the heat, depressurizing the aircraft. Following the depressurization, the crew decided to descend to 12,000 feet.

At this point the crew was exhausted and dehydrated and started making many mistakes. One of these errors was miscalculating the fuel burn rate, which was higher than normal because of the lower altitude, and a stuck fuel gauge didn’t help with the perception of the problem. Eventually they alerted Mather of their need for an air tanker, but they ran out of fuel about 2 1/2 miles before the rendezvous with the tanker. With the plane doomed to crash at that point, all crew members were able to bail out successfully, with the pilot steering the plane at the last minute toward a fallow rice field. The only fatality in the incident occurred on the ground when a fire truck responding from Beale Air Force Base overturned. The two hydrogen bombs aboard the plane were severely damaged in the crash but the high explosives they contained did not detonate and no radioactive materials were released.

In hindsight the crew should have aborted the mission when the cockpit temperatures grew unbearable. But this was the height of the Cold War, and Strategic Air Command was pushing their wing commands very hard to keep the early airborne alert program operational at all times.

Posted by at 02:47 PM in Technology | Link |

5 August 25

The Broken Arrow Down The Road

A Broken Arrow incident is, in United States military terminology, an accidental event involving nuclear weapons or components that does not create a risk of nuclear war. Today is the 75th anniversary of a Broken Arrow event that happened in Solano County less than 25 miles from here.

On 5 August 1950 a Boeing B-29 bomber was leaving Fairfield-Suisun Air Force Base bound for Guam when it crashed shortly after takeoff. It was carrying a Mark 4 nuclear bomb and was part of a contingent of 10 nuclear-capable B-29s being sent to Guam to serve as a deterrent to the People’s Republic of China at the start of the Korean War. The bomb in this B-29 did not have its fissile core installed, so there was no risk of a nuclear explosion, but the high explosives in the bomb could and did explode in the fire that followed the crash. Twelve of the 20 crew and passengers on the plane died, as did seven people on the ground in the explosion, which spread wreckage over about 2 square miles.

One of the passengers killed in the crash was Brigadier General Robert F. Travis, who at the time was commanding the 5th and 9th Strategic Reconnaissance Wings. To honor him, the base was renamed Travis Air Force Base in 1951.

I learned about this anniversary by seeing a reference to it on my Cat Lovers Against The Bomb calendar. I very much like the version of this calendar (the “classic” version) with black-and-white photographs of cats and have been using it as my office calendar for many years now.

Posted by at 03:33 PM in Technology | Link |
