Simon has been involved in software development since the days of paper tape. He has developed niche software for information management.
The leading desktop, web, and mobile apps for captioning as evaluated in this review are listed below. More details of these and other apps follow. To locate one, press Ctrl-F and enter the application name.
Windows Desktop Applications
Application | Rating | Notes
Caption Pro | 4 Stars | Converts prints, editable captions
Photo Gallery | 3 Stars | Photo editor, unsupported
Photo Caption Creator | 3.5 Stars | Unsupported
Picasa | 3 Stars | Photo editor, unsupported
AutoSplitter | 3 Stars | Converts and captions prints
JpgStory | 2.5 Stars | Multi-line captions
Mac Desktop Applications
Non-Native Mobile Apps
Application | Rating | Notes
JpgStory Little | 3 Stars | iPhone only, preserves pixels
JpgStory | 3 Stars | iPad only, preserves pixels
Typorama | 3.5 Stars | iOS only
Mematic | 4.5 Stars | iOS, Android, preserves pixels
Phonto | 3.5 Stars | iOS and Android
PicScanner Gold | 4 Stars | iOS only, preserves pixels
PhotoMyne | 4.5 Stars | iOS and Android, preserves pixels
Why Add Text to Photos?
Taking photos has never been easier. Some estimates place the total number of photographs in the world at about 100 billion. Their content has meaning for the people who took them, and maybe the people who appear in them. For anyone else, a few words of context add enormously to their value. In the paper era, these words were often added on the back of a print, or in an album.
Digital photos have a huge capacity for storing data within their file structure, but this is mostly used for recording automatically captured data such as camera and exposure parameters, date and time. Geo-tagging using recorded latitude and longitude from GPS data is frequently added by mobile phone cameras.
The Four W's of Journalism
However, what people most want to know about photographs are the four W's of journalism: who, what, where and when. Computer power can be applied to answering all of these questions.
'When' is easily supplied, relying only on internal clocks in the camera. They may become confused about time zones, but an accuracy of a day or so is all most people want.
'Who' is performed increasingly well by automated face recognition, once some examples have been provided. Without examples, faces tend to be recognized as celebrities.
'What' is a question that automated image analysis (or auto-captioning) struggles with. It often comes up with accurate but uninformative descriptions as shown below.
'Where' is provided by turning latitude and longitude into a named location using a gazetteer database. Mobile phones do a pretty good job in well-populated areas, but off the beaten track, results may not be satisfactory. Digital cameras do not routinely have built-in GPS location tracking.
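As a rough illustration of the 'where' step, the sketch below converts coordinates stored as degrees, minutes and seconds into decimal degrees and then picks the closest entry from a gazetteer. The three-entry gazetteer and the function names are hypothetical, invented for this example; real gazetteers hold millions of entries and are not searched linearly.

```python
import math

def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus an N/S/E/W reference to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points on the Earth."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical gazetteer for illustration only: (name, latitude, longitude)
GAZETTEER = [
    ("Sydney", -33.8688, 151.2093),
    ("Melbourne", -37.8136, 144.9631),
    ("Canberra", -35.2809, 149.1300),
]

def nearest_place(lat, lon, gazetteer=GAZETTEER):
    """Return (name, distance_km) for the gazetteer entry closest to the given point."""
    name, plat, plon = min(gazetteer, key=lambda e: haversine_km(lat, lon, e[1], e[2]))
    return name, haversine_km(lat, lon, plat, plon)
```

Calling `nearest_place(dms_to_decimal(33, 54, 0, "S"), dms_to_decimal(151, 12, 0, "E"))` would name Sydney. A photo taken far from any listed place still gets the nearest name, however distant, which is one reason results off the beaten track can be unsatisfactory.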
Although technology is making inroads into automatically adding the kind of information humans want to photos, it has a long way to go, and adding text manually looks like being necessary for many years yet.
The Progression of Broadsheets and Posters to Memes
With social media came the meme, where the image resonates with the text rather than the text describing the image. Some memes are the electronic successors to the broadsheets and posters that have been used to influence public opinion for centuries.
However, the near-zero marginal cost of electronic production and distribution means that many more people now create them, and humorous/philosophical memes now probably outnumber the ones seeking to influence people.
Another class of computer application assists with the creation of digital photo albums from existing hard-copy albums, which many people own, usually containing photos of family members. These applications have the capability to create a number of separate digital images from a scan or photograph of a number of paper prints, such as those appearing in a photo album.
As well as improving the quality of original images in the manner of photo editors, the applications can add captions (and sometimes metadata) to each of the digital images. The caption is added below the original image so that it does not obscure any of the original image pixels. These capabilities are of particular interest to people interested in genealogy.
What Software Should I Use to Add Text to My Photos?
When computers were less powerful and graphical user interfaces were a novelty, information about images was often easily visible in the file browsers (such as Windows Explorer). In that environment, adding information to a file name or placing the file in a folder with an informative name was what most people did.
Nowadays, applications dominate the operating system. File names and folders are not readily accessible to image viewing applications, especially on mobile devices. If it’s not in the image pixels, users won’t see it. This gives new importance to embedding information into images.
Social Media Captioning Is Limited
Most social media platforms offer image captioning, but the captions are placed on a web page containing the image and are only visible using that platform. If you download a single image from a social media platform, the caption (or any other metadata added by the platform) does not come with it, although if you download all your images, some metadata may be included in the download.
This review looks at some of the leading software products for a range of image captioning tasks that you might conduct on a desktop, mobile device or using a Web application. These include adding names of people or places to photos you've taken yourself or creating a meme to reach as many people as possible. Different tasks need different software.
A major feature of social media is the ability to upload images and videos and text relating to them. The image and text are then displayed on a web page such as that shown below.
This facility is widely used, but viewing of the image and caption is restricted to the Facebook application. If the photo and video data are downloaded, images have reduced pixel dimensions and arbitrary names. Caption data such as that shown in the red box above is not included in the download.
Google Photos is a very popular web-based photo editing, sharing and tagging application with impressive image analysis capabilities that make it possible to search images for a range of entities, and to group photos according to who appears in them using facial recognition.
Multi-line captions can be added to images stored in Google Photos via the Info box as shown below:
However, if the Info Box is not shown, the multiple lines are concatenated at the bottom left of the image as shown below:
Desktop Captioning Applications
Almost all image editing applications on desktops, phones, and tablets provide the capability to add text anywhere on top of an image, and many will let you add a blank region to the image in which you can place your caption. The major difficulty with using such applications for captioning is complexity: users want to type their captions, not learn how to use an image editor.
A further difficulty is choosing a color for the text that stands out from the photo background. The same color may not work in all the photos you write on. If you make a mistake (such as mis-spelling a sports team member's name) you have to re-do the whole caption. And for historic family photos, you may not want to write over any of the existing pixels.
Fortunately, there are a few dedicated desktop captioning applications that address these difficulties, but support for them may not be available.
1. Windows Photos
The current native Windows photo management application is Photos. This is only available for Windows 10 and replaces the venerable Paint. Text can be added to the existing image via the Paint3D application, but with the limitations discussed above.
However, Photos has impressive automatic image analysis, presumably in order to compete with Google Photos. It supports search using text describing objects in photos, such as food, animals, etc. in order to retrieve images containing the objects. It also performs extraction of text appearing in images, so if an image is captioned, a search for text appearing in the caption will give the image as a result. Faces are also extracted automatically and naming one will allow detection of all images containing this face.
As Photos is a desktop application, the database containing all the automatically extracted data is available for enthusiasts to examine at C:\Users\<UserName>\AppData\Local\Packages\Microsoft.Windows.Photos_8wekyb3d8bbwe\LocalState\MediaDB.v1.sqlite for the current version of Photos (2019.19011.19410.0).
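For readers who want to look inside that database, a minimal sketch using Python's built-in sqlite3 module is given below. It only lists the table names, since the schema is not documented here and no table or column names are assumed. The function name is illustrative.

```python
import sqlite3
from pathlib import Path

def list_tables(db_path):
    """Return the names of all tables in an SQLite database file."""
    # Open read-only so the application's data cannot be modified by accident.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

It is probably safest to copy MediaDB.v1.sqlite to another folder first and call `list_tables` on the copy, as the Photos application may hold the original open.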
2. Photo Gallery
Formerly known as Windows Live Photo Gallery, this free application is an image organizer, photo editor, and photo-sharing app. It is a part of Microsoft's Windows Essentials software suite, but support was discontinued in 2017, probably due to competition from Google Photos, which offers similar functionality as a web app.
However, Windows Essentials is still available for download from CNET and Photo Gallery will run under Windows 10, although the geo-tagging functionality does not work.
Photo Gallery offers a wide range of metadata addition including people (using face recognition to identify named individuals in other images), descriptive tags and a caption.
Photo Gallery Caption Display
This caption display disappears after 5 seconds unless the cursor is over the caption, and is not shown at all in slideshow mode. The size of the captions is only controllable by altering the browser text size.
Although Google's captioning facilities are somewhat limited, download of all Google Photos data does include metadata such as captions as a separate JSON file for each image, and the album structure is mapped to a folder structure, which makes import to other applications much easier.
However, only user-created albums are included. Google Photos' grouping of photos containing a particular person or object is not included in the download unless the album is shared.
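To give an idea of how such a download could be imported elsewhere, the sketch below collects captions from the JSON files in one album folder. It assumes each JSON file holds an object with "title" and "description" keys; those key names are an assumption and should be checked against an actual download.

```python
import json
from pathlib import Path

def read_captions(album_dir):
    """Map image title to caption for every JSON file in one album folder."""
    captions = {}
    for sidecar in sorted(Path(album_dir).glob("*.json")):
        with open(sidecar, encoding="utf-8") as fh:
            data = json.load(fh)
        if not isinstance(data, dict):
            continue  # not the per-image structure this sketch expects
        # "title" and "description" are assumed key names, not confirmed by this review.
        title = data.get("title", sidecar.stem)
        description = str(data.get("description") or "").strip()
        if description:
            captions[title] = description
    return captions
```

Images without a caption are simply left out of the result, so the returned dictionary lists only the files that need a caption applied.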
The Caption Pro desktop application is able to extract caption and album name metadata from data downloaded from Google Photos and generate captions and sub-captions. If files captioned in this way are then uploaded back to Google Photos, text can easily be viewed on all image displays as shown here.
Metadata created by Photo Gallery is not shown in image pixels but is displayed beside the image when it is selected, with the Caption displayed below.
This restricts the display to Windows environments with Photo Gallery installed, although the integration with Windows means that applied metadata is visible when the cursor hovers over a file that has been processed. It also shows all image files stored on local drives on startup, which may be time-consuming.
The ready availability and comprehensive functionality would have given Photo Gallery a substantial user base, but lack of support means that future operation is uncertain. A blog by Jean-Paul Olivier gives advice on how to obtain the geo-tagging facility in Photo Gallery with other applications.
My rating: 3 stars
Google's Picasa is a discontinued image organizer and viewer for organizing and editing digital photos, plus an integrated photo-sharing website. It was superseded by Google Photos in 2016 but is still available for download from CNET. Google produced Mac and Windows versions.
Picasa offers a massive range of functionality, including the addition of captions, but its handling of multi-line captions is limited. Although a new line can be started (by pressing Ctrl-Enter rather than Shift-Enter), only the first line of a multiple-line caption is visible if the caption is typed in.
If a multiple-line caption is pasted in, a larger area is used for the caption, but the area used is limited and overlays the image as shown below. The full caption can be viewed by scrolling in the caption region.
Caption data added by Picasa is only viewable within the application, but is stored in the file metadata, making it potentially accessible to other applications.
Whilst it has an impressive range of capabilities (including the production of Web albums), its deprecation by Google and the fact that its captions are only visible in Picasa make it of limited utility.
My rating: 3 stars
This is a Windows application from Golden Apple Software that can add up to 5 lines of captions below the image and creates a black and white border around the uploaded image, which can be in JPEG, GIF or PNG format. However, the aspect ratio of the selected image must be specified correctly; otherwise, the image is truncated.
Although the caption font and color can be selected, there is no control of the caption size, and all lines are the same height. A watermark is added to images in the unlicensed version. A license costs A$13.39. The downloaded version threw errors when saving images.
The Golden Apple Software website only describes a website builder, so it is unlikely that support is available.
My rating: 1 star
This simple application from Free Picture Solutions only operates on single JPEG images. The captioned image is saved manually. Caption font, color, position (on, above, or below the image) are specified numerically and multiple captions can be applied to the image to create a multi-line caption by a laborious process.
Captions can be edited before saving, but not afterward. Captions can be saved as image descriptions but in this case, the image pixels are not changed. The publisher's website is accessible but does not mention the product, whose copyright date is 2014. It is unlikely that the product is still supported.
My rating: 2 stars
There are a number of products with this name: Caption-Pro is a well-regarded UK-based metadata editing tool for media professionals which generates captions composed of the names of people appearing in the photo using facial recognition. There is also a poorly-reviewed iPhone app of this name offering a selection of pre-made captions for photos.
Caption Pro (from Aleka Consulting) operates with JPEG, TIFF and PNG images and video files (mp4, mov and 3gp formats only). It supports editing of already applied captions by storing them in metadata and placing the caption in a strip below the original image, thus preserving the integrity of the original image and making the caption visible from any image viewing application.
Images and videos can be loaded from a file or folder browser, by drag and drop or by pasting from the Clipboard. Automatically cropped and de-skewed images can also be obtained from scans or photos of multiple paper prints as shown below:
Loaded images can be automatically ordered by file name or date, and manually re-ordered via a graphical interface. The drag-and-drop interface can be used to upload and merge groups of files from different sources, such as a mobile phone and a digital camera SD card.
Different Types of Captions
Captions can be added as a single block, either by typing or from speech (for Win 10 users) with either continuous or multi-line text, or in two parts with different sizes for each part, making it particularly useful for team photos. The second part of an image caption may be a region of another image (such as the scanned back of a photo with descriptive handwritten text).
Caption font size is scaled automatically to fit into the area specified for the caption bar. It offers a wide range of other features including streamlined captioning of multiple images, font, and background color selection, slideshow, zoom, aspect ratio adjustment, automatic captioning from geolocation or other metadata (such as that created by Windows Photo Gallery, Picasa, Google Photos, Windows Photos and professional metadata addition programs), and a command-line interface. Double byte characters (such as Chinese) can be included in captions, but not emojis.
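The automatic font scaling described above can be pictured with the sketch below. It is not Caption Pro's actual method, which is not documented in this review. It is one simple way to pick the largest font size at which every line fits a caption bar, using rough assumed values for average character width and line spacing.

```python
def fit_font_size(lines, bar_width, bar_height, max_size=72, min_size=6,
                  char_aspect=0.6, line_spacing=1.2):
    """Largest font size (in pixels) at which all lines fit inside the caption bar.

    char_aspect and line_spacing are rough assumptions: the average glyph width
    as a fraction of the font size, and the line height as a multiple of it.
    """
    longest = max((len(line) for line in lines), default=0)
    # Try sizes from largest to smallest and stop at the first one that fits.
    for size in range(max_size, min_size - 1, -1):
        text_width = longest * size * char_aspect
        text_height = len(lines) * size * line_spacing
        if text_width <= bar_width and text_height <= bar_height:
            return size
    return min_size  # nothing fits; fall back to the smallest allowed size
```

With a fixed bar height, adding lines forces the size down: under these assumptions a 100-pixel-high bar takes one line at size 72 but two lines only at size 41. A real implementation would measure the rendered text with the chosen font rather than estimate its width.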
Caption Pro offers two face recognition options that allow users to name detected faces in an image and have captions added that consist of the face names in left-to-right order. The Standard option, using local resources, is included in the basic license.
The Premium option uses a commercial web service for detection and recognition, and offers superior speed and accuracy to the Standard option. It requires an Internet connection and only 50 Premium transactions are included in the basic license. Additional transactions must be purchased as a top-up or included in the initial license purchase.
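The left-to-right ordering of names can be sketched as follows. The input format, a list of (name, left edge in pixels) pairs, is hypothetical and stands in for whatever a face detector returns; it is not Caption Pro's internal representation.

```python
def caption_from_faces(faces, separator=", "):
    """Build a caption listing named faces in left-to-right order.

    Each face is a (name, left_x) pair, where left_x is the left edge of the
    detected face box in pixels. Faces with an empty name are skipped.
    """
    ordered = sorted(faces, key=lambda face: face[1])
    return separator.join(name for name, _ in ordered if name)
```

For a team photo with several rows, a real tool would also need to group faces by row before sorting, which this sketch does not attempt.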
Image caption text is stored in IPTC metadata fields as well as in the image pixels and a high-quality save option minimizes processing changes, making Caption Pro particularly suitable for genealogy and archival applications.
Caption Pro offers a slideshow view, in which only captioned images and videos are shown, making it particularly suitable for processing large collections of travel photos and videos recorded on a mobile device.
Different Options Depending on User
For non-Windows users or users without local administrator rights, a Web application for single images is available, and remote access to the full application via a Windows server with Dropbox integration is available on request. It offers a free 30-day trial and a basic license fee of US$29. The basic license includes 50 Premium face recognition transactions. Additional Premium Face recognition transactions can be purchased at a cost between US$0.02 and US$0.03 per transaction for 500 or 1000 transactions.
Caption Pro runs on 64 bit Windows versions 7, 8.1 and 10, but face recognition is only available on Windows 10. An earlier version running on 32-bit systems is available. The Aleka Consulting website is accessible and describes Caption Pro, which has a copyright date of 2020. Context-sensitive help, a local help file, and email support are all available.
My rating: 4.5 stars
The Mac version of Caption Pro from Aleka Consulting offers a free 30-day, 25-caption demo license, with a permanent license costing US$29, including 1 year of major updates.
Caption application is similar to Caption Pro (Windows). Batch processing using file metadata to generate caption and/or sub-caption text is supported.
It provides most of the facilities of Caption Pro (Windows) with the exception of face recognition and any processing of video data. The scanner interface can use the photo-detection software built into the Mac scanner interface (with no parameter adjustment but supporting editing of detected photo boundaries) or the Apple Computer Vision rectangle detection, with support for parameter adjustment but not editing of rectangle boundaries. These features are only available for macOS 10.13 (High Sierra) or later.
My rating: 4 stars.
This free app from Dare Software is an early version of the iPhone and iPad apps, dating from 2010, created with Macromedia Director. The company website is not fully functional (as of Mar 2022) but does list other applications available. The downloadable zip file contains an executable which can be run directly by double-clicking. It supports captions of up to 5 lines, added as a caption bar with black text on a white background below the photo. There is only a choice of 2 font sizes. The interface is idiosyncratic, but a help file is provided. Captioned files are resized to a maximum pixel width or height of 640 pixels before the caption is added, and metadata is not preserved. Portrait mode images have to be rotated before captions can be added.
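To show what the 640-pixel limit means for a typical photo, the sketch below computes the reduced dimensions while keeping the aspect ratio. Leaving smaller images unchanged is an assumption, as is the rounding; the app's exact behavior is not documented here.

```python
def scaled_size(width, height, max_side=640):
    """Scale (width, height) so the longer side is at most max_side, keeping the aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # assumed: images already small enough are not enlarged
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

A 4000 x 3000 pixel photo comes out at 640 x 480 pixels, which discards about 97% of the original pixels.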
This app is very much an initial version, but is useful for creating small captioned images which can then be emailed.
My rating: 2.5 stars
This is a free Windows application available via Softpedia, which, unlike Caption Pro, only processes JPEG images. It can process multiple files as well as single ones. Captions can be positioned on or below the image and caption size, font, and color can be selected numerically. Captions on the image are located numerically rather than by dragging.
Multi-line captions are supported, but there is a maximum of 5 lines allowed. However, there is no auto-scaling of font size to fit the caption into the space available, and once applied, captions cannot be edited, although they can be saved as EXIF Descriptions.
The user interface may be daunting to novice users. The author's website (ricksideas.com) is not accessible. The program is dated 2015 so it is probably not currently supported.
My rating: 3.5 stars
This is a Universal Windows Platform app from Fotonice that only runs on Windows 10, and is available from the Microsoft Store at a cost of US$2.95. It allows the addition of a frame around any image in JPEG, BMP or PNG format, as shown below.
Foto Tag allows different amounts of padding of selectable colours to be added to an image. A block of multi-line text, using a selected font may then be added anywhere in the image, including in the caption bar as shown below.
However, the text block has to be positioned manually and although the text font, style, and alignment can be selected, the font size cannot be changed. The font outline and fill colors can be selected only from the range of pastel shades shown in the bottom right-hand corner. The Settings screen does not access any program settings.
No support or company contact details are provided. The most unfortunate feature of Foto Tag is that selecting a different photo to process crashes the application.
Despite its flexibility in adding an area in which a caption can be written, and in framing a photo, Foto Tag is an immature product with numerous defects.
My rating: 1 star
SnipTag runs on macOS 10.12 and higher and comes from the drolly named Hong Kong/Thailand company App Initio via the Apple App Store. The free version of SnipTag is limited to 3 exports per day. Unlimited auto-renewing licenses are available for a quarter or a year (costing US$9.99 and US$26.49 respectively). A permanent license costs US$34.99. The free version presents the Buy License option screen on startup.
As its name suggests, SnipTag has two areas of functionality. The first is to automatically crop and edit multiple photos appearing in a digital image via the Snip function. The second is to edit digital image file metadata and add visible captions to images via the Tag function. On opening SnipTag and loading a file the following screen appears.
On selecting a file and clicking Edit-> Metadata, the metadata editing screen appears, showing some of the file metadata.
Entering text in the description field (or using the dictate function) adds text to the Description field and clicking Save adds the text to a caption bar below the image, prepended by the image date taken from the metadata.
The caption can be edited via the Edit button below the image. In order to embed the caption in the image pixels, the File->Export option should be used. This offers to save the image with or without the caption in a selected location, which appears as below when opened in another application.
If this captioning functionality is all that is required, SnipTag does it adequately. Caption entry is performed on a screen marked Metadata Editor by entering text in the description field. Date and location can optionally be added from metadata. If any data already exists in the IPTC Description field, this is shown in the Description box but has to be copied and pasted to be applied. Captions are applied using the Save button and this operation is subject to license limitations. Double-byte characters, such as Chinese, are supported, as are emojis. The application supports image files in PNG format as well as JPEGs, and probably other image file formats.
The white border added around the image is aesthetically pleasing. The use of IPTC metadata means that other programs can access metadata set by SnipTag. The User Guide is somewhat terse but included in the installation. FAQs are accessed via the Web.
However, there are some limitations. The method of caption entry is not obvious. The image height used for the caption is fixed, so if you need to create a multiple-line caption (which has to be imported via copy and paste, as the description and caption edit input boxes ignore Enter characters), the font size will be reduced to make the caption fit into the caption bar height. There does not seem to be any way of changing the caption font or background color. Applied captions are not editable once the image has been exported.
SnipTag is well-suited to applying short, simple captions, but the product lacks the range of features available in Windows applications such as Caption Pro and the editable captioning provided by Caption Pro for Mac. It is not particularly intuitive to use for visible caption applications, but there should be few difficulties after initial use.
My rating: 3 stars
This program from Chimera Creative Studio, in Hungary, is intended to facilitate the process of moving from physical photographs to a digital album. It incorporates functionality for processing a scan of a number of paper prints (such as might appear on an album page) to generate a set of individual digital image files with individual captions.
Each image is automatically cropped and straightened and can have a caption applied to it. Captions are added in a bar below the digital image, thus retaining all the image pixels. AutoSplitter costs US$19.99 for a permanent license, but with upgrades only available for 30 days. Upgrades are available for 2 years with a US$29.99 license. The unlicensed version adds a watermark to all images, but is fully functional otherwise.
The interface after loading a scan of a photo album page and rotating each of the detected images is shown below. AutoSplitter can also connect directly to a scanner.
The right-hand panel of the interface contains numerous controls for managing the process of splitting an image containing multiple scanned photos and for enhancing these images.
Clicking on a single split brings up the captioning interface. Caption text with a maximum length of 250 characters can be added in the purple highlighted box.