
News from the world of web design and SEO

Here is syndicated news from several of the leading sites in the field of web design and SEO (search engine optimization).

A List Apart: The Full Feed
Articles for people who make web sites.
  • Usability Testing for Voice Content

    It’s an important time to be in voice design. Many of us are turning to voice assistants in these times, whether for comfort, recreation, or staying informed. As the interest in interfaces driven by voice continues to reach new heights around the world, so too will users’ expectations and the best practices that guide their design.

    Voice interfaces (also known as voice user interfaces or VUIs) have been reinventing how we approach, evaluate, and interact with user interfaces. The impact of conscious efforts to reduce close contact between people will continue to increase users’ expectations for the availability of a voice component on all devices, whether that entails a microphone icon indicating voice-enabled search or a full-fledged voice assistant waiting patiently in the wings for an invocation.

    But voice interfaces present inherent challenges and surprises. In this relatively new realm of design, the intrinsic twists and turns in spoken language can make things difficult for even the most carefully considered voice interfaces. After all, spoken language is littered with fillers (in the linguistic sense of utterances like hmm and um), hesitations and pauses, and other interruptions and speech disfluencies that present puzzling problems for designers and implementers alike.

    Once you’ve built a voice interface that introduces information or permits transactions in a rich way for spoken language users, the easy part is done. Nonetheless, voice interfaces also surface unique challenges when it comes to usability testing and robust evaluation of your end result. But there are advantages, too, especially when it comes to accessibility and cross-channel content strategy. The fact that voice-driven content lies on the opposite extreme of the spectrum from the traditional website confers it an additional benefit: it’s an effective way to analyze and stress-test just how channel-agnostic your content truly is.

    The quandary of voice usability

    Several years ago, I led a talented team at Acquia Labs to design and build a voice interface for Digital Services Georgia called Ask GeorgiaGov, which allowed citizens of the state of Georgia to access content about key civic tasks, like registering to vote, renewing a driver’s license, and filing complaints against businesses. Based on copy drawn directly from the frequently asked questions section of the Georgia.gov website, it was the first Amazon Alexa interface integrated with the Drupal content management system ever built for public consumption. Built by my former colleague Chris Hamper, it also offered a host of impressive features, like allowing users to request the phone number of individual government agencies for each query on a topic.

    Designing and building web experiences for the public sector is a uniquely challenging endeavor due to requirements surrounding accessibility and frequent budgetary challenges. Out of necessity, governments need to be exacting and methodical not only in how they engage their citizens and spend money on projects but also how they incorporate new technologies into the mix. For most government entities, voice is a completely different world, with many potential pitfalls.

    At the outset of the project, the Digital Services Georgia team, led by Nikhil Deshpande, expressed their most important need: a single content model across all their content irrespective of delivery channel, as they only had resources to maintain a single rendition of each content item. Despite this editorial challenge, Georgia saw Alexa as an exciting opportunity to open new doors to accessible solutions for differently abled citizens. And finally, because there were relatively few examples of voice usability testing at the time, we knew we would have to learn on the fly and experiment to find the right solution.

    Eventually, we discovered that all the traditional approaches to usability testing that we’d executed for other projects were ill-suited to the unique problems of voice usability. And this was only the beginning of our problems.

    How voice interfaces improve accessibility outcomes

    Any discussion of voice usability must consider some of the most experienced voice interface users: people who use assistive devices. After all, accessibility has long been a bastion of web experiences, but it has only recently become a focus of those implementing voice interfaces. In a world where refreshable Braille displays and screen readers prize the rendering of web-based content into synthesized speech above all, the voice interface seems like an anomaly. But in fact, the exciting potential of Amazon Alexa for differently abled citizens represented one of the primary motivations for Georgia’s interest in making their content available through a voice assistant.

    Questions surrounding accessibility with voice have surfaced in recent years due to the perceived user experience benefits that voice interfaces can offer over more established assistive devices. Because screen readers make no exceptions when they recite the contents of a page, they can occasionally present superfluous information and force the user to wait longer than they’re willing. In addition, with an effective content schema, it can often be the case that voice interfaces facilitate pointed interactions with content at a more granular level than the page itself.

    Though it can be difficult to convince even the most forward-looking clients of accessibility’s value, Georgia has been not only a trailblazer but also a committed proponent of content accessibility beyond the web. The state was among the first jurisdictions to offer a text-to-speech (TTS) phone hotline that read web pages aloud. After all, state governments must serve all citizens equally—no ifs, ands, or buts. And while these are still early days, I can see voice assistants becoming new conduits, and perhaps more efficient channels, by which differently abled users can access the content they need.

    Managing content destined for discrete channels

    Whereas voice can improve accessibility of content, it’s seldom the case that web and voice are the only channels through which we must expose information. For this reason, one piece of advice I often give to content strategists and architects at organizations interested in pursuing voice-driven content is to never think of voice content in isolation. Siloing it is the same misguided approach that has led to mobile applications and other discrete experiences delivering orphaned or outdated content to a user expecting that all content on the website should be up-to-date and accessible through other channels as well.

    After all, we’ve trained ourselves for many years to think of content in the web-only context rather than across channels. Our closely held assumptions about links, file downloads, images, and other web-based marginalia and miscellany are all aspects of web content that translate poorly to the conversational context—and particularly the voice context. Increasingly, we all need to concern ourselves with an omnichannel content strategy that straddles all those channels in existence today and others that will doubtlessly surface over the horizon.

    With the advantages of structured content in Drupal 7, Georgia.gov already had a content model amenable to interlocution in the form of frequently asked questions (FAQs). While question-and-answer formats are convenient for voice assistants because queries for content tend to come in the form of questions, the returned responses likewise need to be as voice-optimized as possible.

    For Georgia.gov, the need to preserve a single rendition of all content across all channels led us to perform a conversational content audit, in which we read aloud all of the FAQ pages, putting ourselves in the shoes of a voice user, and identified key differences between how a user would interpret the written form and how they would parse the spoken form of that same content. After some discussion with the editorial team at Georgia, we opted to limit calls to action (e.g., “Read more”), links lacking clear context in surrounding text, and other situations confusing to voice users who cannot visualize the content they are listening to.

    Here’s a table containing examples of how we converted certain text on FAQ pages to counterparts more appropriate for voice. Reading each sentence aloud, one by one, helped us identify cases where users might scratch their heads and say “Huh?” in a voice context.

    Before: Learn how to change your name on your Social Security card.
    After: The Social Security Administration can help you change your name on your Social Security card.

    Before: You can receive payments through either a debit card or direct deposit. Learn more about payments.
    After: You can receive payments through either a debit card or direct deposit.

    Before: Read more about this.
    After: In Georgia, the Family Support Registry typically pulls payments directly from your paycheck. However, you can send your own payments online through your bank account, your credit card, or Western Union. You may also send your payments by mail to the address provided in your court order.

    In areas like content strategy and content governance, content audits have long been key to understanding the full picture of your content, but it doesn’t end there. Successful content audits can run the gamut from automated checks for orphaned content or overly wordy articles to more qualitative analyses of how content adheres to a specific brand voice or certain design standards. For a content strategy truly prepared for channels both here and still to come, a holistic understanding of how users will interact with your content in a variety of situations is a baseline requirement today.

    Other conversational interfaces have it easier

    Spoken language is inherently hard. Even the most gifted orators can have trouble with it. It’s littered with mistakes, starts and stops, interruptions, hesitations, and a vertiginous range of other uniquely human transgressions. The written word, because it’s committed instantly to a mostly permanent record, is tame, staid, and carefully considered in comparison.

    When we talk about conversational interfaces, we need to draw a clear distinction between user experiences that traffic in written language and those that traffic in spoken language. As we know from the relative solidity of written language and literature versus the comparative transience of spoken language and oral traditions, in many ways the two couldn’t be more different from one another. The implications for designers are significant because spoken language, from the user’s perspective, lacks a graphical equivalent to which those scratching their head can readily refer. We’re dealing with the spoken word and aural affordances, not pixels, written help text, or visual affordances.

    Why written conversational interfaces are easier to evaluate

    One of the privileges that chatbots and textbots enjoy over voice interfaces is the fact that by design, they can’t hide the previous steps users have taken. Any conversational interface user working in the written medium has access to their previous history of interactions, which can stretch back days, weeks, or months: the so-called backscroll. A flight passenger communicating with an airline through Facebook Messenger, for example, knows that they can merely scroll up in the chat history to confirm that they’ve already provided the company with their e-ticket number or frequent flyer account information.

    This has outsize implications for information architecture and conversational wayfinding. Since chatbot users can consult their own written record, it’s much harder for things to go completely awry when they make a move they didn’t intend. Recollection is much more difficult when you have to remember what you said a few minutes ago off the top of your head rather than scrolling up to the information you provided a few hours or weeks ago. An effective chatbot interface may, for example, enable a user to jump back to a much earlier, specific place in a conversation’s history. Voice interfaces that live perpetually in the moment have no such luxury.

    Eye tracking only works for visual components

    In many cases, those who work with chatbots and messaging bots (especially those leveraging text messages or other messaging services like Facebook Messenger, Slack, or WhatsApp) have the unique privilege of benefiting from a visual component. Some conversational interfaces now insert other elements into the conversational flow between a machine and a person, such as embedded conversational forms (like SPACE10’s Conversational Form) that allow users to enter rich input or select from a range of possible responses.

    The success of eye tracking in more traditional usability testing scenarios highlights its appropriateness for visual interfaces such as websites, mobile applications, and others. However, from the standpoint of evaluating voice interfaces that are entirely aural, eye tracking serves only the limited (but still interesting from a research perspective) purpose of assessing where the test subject is looking while speaking with an invisible interlocutor—not whether they are able to use the interface successfully. Indeed, eye tracking is only a viable option for voice interfaces that have some visual component, like the Amazon Echo Show.

    Think-aloud and concurrent probing interrupt the conversational flow

    A well-worn approach for usability testing is think-aloud, which allows users to verbalize their frequently qualitative impressions of an interface while interacting with the user experience in question. Paired with eye tracking, think-aloud adds considerable dimension to a usability test for visual interfaces such as websites and web applications, as well as other visually or physically oriented devices.

    Another is concurrent probing (CP). Probing involves the use of questions to gather insights about the interface from users, and Usability.gov describes two types: concurrent, in which the researcher asks questions during interactions, and retrospective, in which questions only come once the interaction is complete.

    Conversational interfaces that utilize written language rather than spoken language can still be well-suited to think-aloud and concurrent probing approaches, especially for the components in the interface that require manual input, like conversational forms and other traditional UI elements interspersed throughout the conversation itself.

    But for voice interfaces, think-aloud and concurrent probing are highly questionable approaches and can catalyze a variety of unintended consequences, including accidental invocations of trigger words (such as Alexa mishearing “selected” as “Alexa”) and introduction of bad data (such as speech transcription registering both the voice interface and test subject). After all, in a hypothetical think-aloud or CP test of a voice interface, the user would be responsible for conversing with the voice interface while simultaneously offering up their impressions to the evaluator overseeing the test.

    Voice usability tests with retrospective probing

    Retrospective probing (RP), a lesser-known approach for usability testing, is seldom seen in web usability testing due to its chief weakness: the fact that we have awful memories and rarely remember what occurred mere moments earlier with anything that approaches total accuracy. (This might explain why the backscroll has joined the pantheon of rigid recordkeeping currently occupied by cuneiform, the printing press, and other means of concretizing information.)

    For users of voice assistants lacking scrollable chat histories, retrospective probing introduces the potential for subjects to include false recollections in their assessments or to misinterpret the conclusion of their conversations. That said, retrospective probing permits the participant to take some time to form their impressions of an interface rather than dole out incremental tidbits in a stream of consciousness, as would more likely occur in concurrent probing.

    What makes voice usability tests unique

    Voice usability tests have several unique characteristics that distinguish them from web usability tests or other conversational usability tests, but some of the same principles unify both visual interfaces and their aural counterparts. As always, “test early, test often” is a mantra that applies here, as the earlier you can begin testing, the more robust your results will be. Having an individual to administer a test and another to transcribe results or watch for signs of trouble is also an effective best practice in settings beyond just voice usability.

    Interference from poor soundproofing or external disruptions can derail a voice usability test even before it begins. Many large organizations will have soundproof rooms or recording studios available for voice usability researchers. For the vast majority of others, a mostly silent room will suffice, though absolute silence is optimal. In addition, many subjects, even those well-versed in web usability tests, may be unaccustomed to voice usability tests in which long periods of silence are the norm to establish a baseline for data.

    How we used retrospective probing to test Ask GeorgiaGov

    For Ask GeorgiaGov, we used the retrospective probing approach almost exclusively to gather a range of insights about how our users were interacting with voice-driven content. We endeavored to evaluate interactions with the interface early and diachronically. In the process, we asked each of our subjects to complete two distinct tasks that would require them to traverse the entirety of the interface by asking questions (conducting a search), drilling down into further questions, and requesting the phone number for a related agency. Though this would be a significant ask of any user working with a visual interface, the unidirectional focus of voice interface flows, by contrast, reduced the likelihood of lengthy accidental detours.

    Here are a couple of example scenarios:

    You have a business license in Georgia, but you’re not sure if you have to register on an annual basis. Talk with Alexa to find out the information you need. At the end, ask for a phone number for more information.

    You’ve just moved to Georgia and you know you need to transfer your driver’s license, but you’re not sure what to do. Talk with Alexa to find out the information you need. At the end, ask for a phone number for more information.

    We also peppered users with questions after the test concluded to learn about their impressions through retrospective probing:

    • “On a scale of 1–5, based on the scenario, was the information you received helpful? Why or why not?”
    • “On a scale of 1–5, based on the scenario, was the content presented clear and easy to follow? Why or why not?”
    • “What’s the answer to the question that you were tasked with asking?”

    Because state governments also routinely deal with citizen questions having to do with potentially traumatic issues such as divorce and sexual harassment, we also offered the choice for participants to opt out of certain categories of tasks.

    While this testing procedure yielded compelling results that indicated our voice interface was performing at the level it needed to despite its experimental nature, we also ran into considerable challenges during the usability testing process. Restoring Amazon Alexa to its initial state and troubleshooting issues on the fly proved difficult during the initial stages of the implementation, when bugs were still common.

    In the end, we found that many of the same lessons that apply to more storied examples of usability testing were also relevant to Ask GeorgiaGov: the importance of testing early and testing often, the need for faithful yet efficient transcription, and the surprising staying power of bugs when integrating disparate technologies. Despite Ask GeorgiaGov’s many similarities to other interface implementations in terms of technical debt and the role of usability testing, we were overjoyed to hear from real Georgians whose engagement with their state government could not be more different from before.

    Conclusion

    Many of us may be building interfaces for voice content to experiment with newfangled channels, or to build for differently abled people and people newer to the web. Now, voice interfaces are necessities for many others, especially as social distancing practices continue to take hold worldwide. Nonetheless, it’s crucial to keep in mind that voice should be only one component of a channel-agnostic strategy equipped for content ripped away from its usual contexts. Building usable voice-driven content experiences can teach us a great deal about how we should envisage our milieu of content and its future in the first place.

    Gone are the days when we could write a page in HTML and call it a day; content now needs to be rendered through synthesized speech, augmented reality overlays, digital signage, and other environments where users will never even touch a personal computer. By focusing on structured content first and foremost with an eye toward moving past our web-based biases in developing our content for voice and others, we can better ensure the effectiveness of our content on any device and in any form factor.

    Eight months after we finished building Ask GeorgiaGov in 2017, we conducted a retrospective to inspect the logs amassed over the past year. The results were striking. Vehicle registration, driver’s licenses, and the state sales tax comprised the most commonly searched topics. 79.2% of all interactions were successful, an achievement for one of the first content-driven Alexa skills in production, and 71.2% of all interactions led to the issuance of a phone number that users could call for further information.

    But deep in the logs we implemented for the Georgia team’s convenience, we found a number of perplexing 404 Not Found errors related to a search term that kept being recorded over and over again as “Lawson’s.” After some digging and consulting the native Georgians in the room, we discovered that one of our dear users with a particularly strong drawl was repeatedly pronouncing “license” in her native dialect to no avail.

    As this anecdote highlights, just as no user experience can be truly perfect for everyone, voice content is an environment where imperfections can highlight considerations we missed in developing cross-channel content. And just as we have much to learn when it comes to the new shapes content can take as it jumps off the screen and out the window, it seems our voice interfaces still have a ways to go before they take over the world too.

    Special thanks to Nikhil Deshpande for his feedback during the writing process.

  • Cross-Cultural Design

    When I first traveled to Japan as an exchange student in 2001, I lived in northern Kyoto, a block from the Kitayama subway station.

    My first time using the train to get to my university was almost a disaster, even though it was only two subway stops away. I thought I had everything I needed to successfully make the trip. I double- and triple-checked that I had the correct change in one pocket and a computer printout of where I was supposed to go in the other. I was able to make it down into the station, but then I just stood at a ticket machine, dumbfounded, looking at all the flashing lights, buttons, and maps above my head (Fig 5.1). Everything was so impenetrable. I was overwhelmed by the architecture, the sounds, the signs, and the language.

    Photo of two subway ticket machines with complex maps above them
    Fig 5.1: Kyoto subway ticket machines—with many line maps and bilingual station names—can seem complicated, especially to newcomers.

    My eyes craved something familiar—and there it was. The ticket machine had a small button that said English! I pushed it but became even more lost: the instructions were poorly translated, and anyway, they explained a system that I couldn’t use in the first place.

    Guess what saved me? Two little old Japanese ladies. As they bought tickets, I casually looked over their shoulders to see how they were using the machines. First, they looked up at the map to find their desired destination. Then, they noted the fare written next to the station. Finally, they put some money into the machine, pushed the button that lit up with their correct fare, and out popped the tickets! Wow! I tried it myself after they left. And after a few tense moments, I got my ticket and headed through the gates to the train platform.

    I pride myself on being a third-culture kid, meaning I was raised in a culture other than the country named on my passport. But even with a cultural upbringing in both Nigeria and the US, it was one of the first times I ever had to guess my way through a task with no previous reference points. And I did it!

    Unfortunately, the same guesswork happens online a million times a day. People visit sites that offer them no cultural mental models or visual framework to fall back on, and they end up stumbling through links and pages. Effective visual systems can help eliminate that guesswork and uncertainty by creating layered sets of cues in the design and interface. Let’s look at a few core parts of these design systems and tease out how we can make them more culturally responsive and multifaceted.

    Typography

    If you work on the web, you deal with typography all the time. This isn’t a book about typography—others have written far more eloquently and technically on the subject. What I would like to do, however, is examine some of the ways culture and identity influence our perception of type and what typographic choices designers can make to help create rich cross-cultural experiences.

    Stereotypography

    I came across the word stereotypography a few years ago. Being African, I’m well aware of the way my continent is portrayed in Western media—a dirt-poor, rural monoculture with little in the way of technology, education, or urbanization. In the West, one of the most recognizable graphic markers for things African, tribal, or uncivilized (and no, they are not the same thing) is the typeface Neuland. Rob Giampietro calls it “the New Black Face,” a clever play on words. In an essay, he asks an important question:

    How did [Neuland and Lithos] come to signify Africans and African-Americans, regardless of how a designer uses them, and regardless of the purpose for which their creators originally intended them? (http://bkaprt.com/ccd/05-01/)

    From its release in 1923 and continued use through the 1940s in African-American-focused advertising, Neuland has carried heavy connotations and stereotypes of cheapness, ugliness, tribalism, and roughness. You see this even today. Neuland is used in posters for movies like Tarzan, Jurassic Park, and Jumanji—movies that are about jungles, wildness, and scary beasts lurking in the bush, all Western symbolism for the continent of Africa. Even MyFonts’ download page for Neuland (Fig 5.2) includes tags for “Africa,” “jungle fever,” and “primitive”—tags unconnected to anything else in the product besides that racist history.

    Fig 5.2: On MyFonts, the Neuland typeface is tagged with “Africa”, “jungle fever”, and “primitive”, perpetuating an old and irrelevant typographic stereotype (http://bkaprt.com/ccd/05-02/).

    Don’t make, use, or sell fonts this way. Here are some tips on how to avoid stereotypography when defining your digital experiences:

    • Be immediately suspicious of any typeface that “looks like” a culture or country. For example, so-called “wonton” or “chop-suey” fonts, whose visual style is thought to express “Asianness” or to suggest Chinese calligraphy, have long appeared on food cartons, signs, campaign websites, and even Abercrombie & Fitch T-shirts with racist caricatures of Asians (http://bkaprt.com/ccd/05-03/). Monotype’s website, where you can buy a version called Mandarin Regular (US$35), cringingly describes the typeface’s story as “an interpretation of artistically drawn Asian brush calligraphy” (Fig 5.3). Whether or not you immediately know its history, run away from any typeface that purports to represent an entire culture.
    A font called "Mandarin" with a stereotypical Asian aesthetic
    Fig 5.3: Fonts.com sells a typeface called Mandarin Regular with the following description: “The stylized Asian atmosphere is not created only by the forms of the figures but also by the very name of the typeface. A mandarin was a high official of the ancient Chinese empire” (http://bkaprt.com/ccd/05-04/).
    • Support type designers who are from the culture you are designing for. This might seem like a difficult task, but the internet is a big place. I have found that, for clients who are sensitive to cultural issues, the inclusion of type designers’ names and backgrounds can be a powerful differentiator, even making its way into their branding packages as a point of pride.

    The world wide webfont

    Another common design tool you should consider is webfonts—fonts specifically designed for use on websites and apps. One of the main selling points of webfonts is that instead of putting text in images, clients can use live text on their sites, which is better for SEO and accessibility. They are simple to implement these days, a matter of adding a line of code or checking a box on a templating engine. The easiest way to get them on your site is by using a service like Google Fonts, Fontstand, or Adobe Fonts.

    Or is it? That assumes those services are actually available to your users.

    Google Fonts (and every other service using Google’s Developer API) is blocked in mainland China, which means that any of those nice free fonts you chose would simply not load (http://bkaprt.com/ccd/05-05/). You can work around this, but it also helps to have a fallback font—that’s what they’re for.

    When you’re building your design system, why not take a few extra steps to define some webfonts that remain visible in places with content blocking? Justfont is one of the first services focused on offering a wide range of Chinese webfonts (http://bkaprt.com/ccd/05-06/). They have both free and paid tiers of service, similar to Western font services. After setting up an account, you can grab whatever CSS and font-family information you need.
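
    As a minimal sketch of that fallback thinking (the families named are real system and Google Fonts faces, but the stack itself is illustrative, not Justfont’s actual snippet):

      /* Load a webfont, but keep local CJK system fonts in the stack
         so text still renders where the service is blocked or slow. */
      @import url('https://fonts.googleapis.com/css2?family=Noto+Sans+TC&display=swap');

      body {
        font-family: 'Noto Sans TC',       /* webfont, if it loads */
                     'PingFang TC',        /* macOS/iOS system fallback */
                     'Microsoft JhengHei', /* Windows system fallback */
                     sans-serif;           /* generic last resort */
      }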

    Multiple script systems

    When your design work requires more than one script—for instance, a Korean typeface and a Latin typeface—your choices get much more difficult. Designs that incorporate more than one are called multiple script systems (multiscript systems for short). Combining them is an interesting design challenge, one that requires extra typographic sensitivity. Luckily, your multiscript choices will rarely appear on the same page together; you will usually be choosing fonts that work across the brand, not that work well next to one another visually.

    Let’s take a look at an example of effective multiscript use. SurveyMonkey, an online survey and questionnaire tool, has their site localized into a variety of different languages (Fig 5.4). Take note of the headers, the structure of the text in the menu and buttons, and how both fonts feel like part of the same brand.

    A SurveyMonkey page in Korean with a simple aesthetic A SurveyMonkey page in English with a simple and similar aesthetic
    Fig 5.4: Compare the typographic choices in the Korean (http://bkaprt.com/ccd/05-07/) and US English (http://bkaprt.com/ccd/05-08/) versions of SurveyMonkey’s Take a Tour page. Do the header type and spacing retain the spirit of the brand while still accounting for typographic needs?

    Some tips as you attempt to choose multiscript fonts for your project:

    • Inspect the overall weight and contrast level of the scripts. Take the time to examine how weight and contrast are used in the scripts you’re using. Find weights and sizes that give you a similar feel and give the page the right balance, regardless of the script.
    • Keep an eye on awkward script features. Character x-heights, descenders, ascenders, and spacing can throw off the overall brand effect. For instance, Japanese characters are always positioned within a grid with all characters designed to fit in squares of equal height and width. Standard Japanese typefaces also contain Latin characters, called romaji. Those Latin characters will, by default, be kerned according to that same grid pattern, often leaving their spacing awkward and ill-formed. Take the extra time to find a typeface that doesn’t have features that are awkward to work with.
    • Don’t automatically choose scripts based on superficial similarity. Initial impressions don’t always mean a typeface is the right one for your project. In an interview in the book Bi-Scriptual, Jeongmin Kwon, a typeface designer based in France, offers an example (http://bkaprt.com/ccd/05-09/). Nanum Myeongjo, a contemporary Hangul typeface, might at first glance look really similar to a seventeenth-century Latin old-style typeface—for instance, they both have angled serifs. However, Nanum Myeongjo was designed in 2008 with refined, modern strokes, whereas old-style typefaces were originally created centuries ago and echo handwritten letterforms (http://bkaprt.com/ccd/05-10/). Looking at the Google Fonts page for Nanum Myeongjo, though, none of that is clear (Fig 5.5). The page automatically generates a Latin Nn glyph in the top left of the page, instead of a more representative Hangul character sample. If I based my multiscript font choices on my initial reactions to that page, my pairings wouldn’t accurately capture the history and design of each typeface.
    A font with a large sample character in Latin text rather than a more representative Hangul character
    Fig 5.5: The Google Fonts page for Nanum Myeongjo shows a Latin character sample in the top left, rather than a more representative character sample.

    Visual density

    CSS can help you control visual density—how much text, image, and other content there is relative to the negative space on your page. As you read on, keep cultural variables in mind: different cultures value different levels of visual density.

    Let’s compare what are commonly called CJK (Chinese, Japanese, Korean) alphabets and Latin (English, French, Italian, etc.) alphabets. CJK alphabets have more complex characters, with shapes that are generally squarer than Latin letterforms. The glyphs also tend to be more detailed than Latin ones, resulting in a higher visual density.

    Your instinct might be to create custom type sizes and line heights for each of your localized pages. That is a perfectly acceptable option, and if you are a typophile, it may drive you crazy not to do it. But I’m here to tell you that when adding CJK languages to a design system, you can update it to account for their visual density without ripping out a lot of your original CSS:

    1. Choose a font size that is slightly larger for CJK characters, because of their density.
    2. Choose a line height that gives you ample vertical space between each line of text (referred to as line-height in CSS).
    3. Look at your Latin text in the same sizes and see if it still works.
    4. Tweak them together to find a size that works well with both scripts.

    The 2017 site for Typojanchi, the Korean Typography Biennale, follows this methodology (Fig 5.6). Both the English and Korean texts have a font-size of 1.25em, and a line-height of 1.5. The result? The English text takes up more space vertically, and the block of Korean text is visually denser, but both are readable and sit comfortably within the overall page design. It is useful to compare translated websites like this to see how CSS styling can be standardized across Latin and CJK pages.

    A basic layout with English text. A basic layout with Korean text.
    Fig 5.6: The 2017 site for Typojanchi, the Korean Typography Biennale, shows differing visual density in action. It is useful to compare translated websites like this to see how CSS styling can be standardized across Latin and CJK pages (http://bkaprt.com/ccd/05-11/).
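
    A minimal sketch of this shared-sizing approach, using the values cited above (the selectors are illustrative and assume your markup sets lang attributes):

      /* One shared ruleset that works for both scripts */
      body {
        font-size: 1.25em;  /* slightly larger, to suit the denser CJK glyphs */
        line-height: 1.5;   /* ample vertical space for both scripts */
      }

      /* Optional per-script tweak, only if testing shows you need one */
      :lang(ko) {
        letter-spacing: 0.02em;
      }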

    Text expansion factors

    Expansion factors estimate how long strings of text will be in different languages. They use either a decimal (1.8) or a percentage (180%) to express the length of a text string in another language relative to English; with a factor of 1.8, for example, a ten-character English label should be budgeted roughly eighteen characters’ worth of space. Of course, the exact length depends on the actual word or phrase, but think of expansion factors as a very rough way to anticipate space for text when it gets translated.

    Using expansion factors is best when planning for microcopy, calls to action, and menus, rather than long-form content like articles or blog posts that can freely expand down the page. The Salesforce Lightning Design System offers a detailed expansion-factor table to help designers roughly calculate space requirements for other languages in a UI (Fig 5.7).

    A chart showing how a different piece of content lays out in different languages
    Fig 5.7: This expansion-factor table from Salesforce lets designers and developers estimate the amount of text that will exist in different languages. Though dependent on the actual words, such calculations can give you a benchmark to design with content in mind (http://bkaprt.com/ccd/05-12/).

    But wait! Like everything in cross-cultural design, nothing is ever that simple. Japanese, for example, has three scripts: kanji, for characters of Chinese origin; hiragana, for words and sounds that are not represented in kanji; and katakana, for words borrowed from other languages.

    The follow button is a core part of the Twitter experience. It has six characters in English (“Follow”) and four in Japanese (フォロー), but the Japanese version is twenty percent longer because it is in katakana, and those characters take up more space than kanji (Fig 5.8). Expansion tables can struggle to accommodate the complex diversity of human scripts and languages, so don’t look to them as a one-stop or infallible solution.

    The Twitter UI in Japanese. The Twitter UI in English.
    Fig 5.8: On Twitter, expansion is clearly visible: the English “Follow” button text comes in at about 47 pixels wide, while the Japanese text comes in at 60 pixels wide.

    Here are a few things you can do to keep expansion factors in mind as you design (see the sketch after this list):

    • Generate dummy text in different languages for your design comps. Of course, you should make sure your text doesn’t contain any unintentional swearwords or improper language, but tools like Foreign Ipsum are a good place to start getting your head around expansion factors (http://bkaprt.com/ccd/05-13/).
    • Leave extra space around buttons, menu items, and other microcopy. As well as being general good practice in responsive design, this allows you to account for how text in your target languages expands.
    • Make sure your components are expandable. Stay away from assigning a fixed width to your UI elements unless it’s unavoidable.
    • Let longer text strings wrap to a second line. Just ensure that text is aligned correctly and is easy to scan.
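
    As a hedged sketch of those tips in CSS (the class name is hypothetical):

      /* A button that can grow with its translated label */
      .button {
        display: inline-block;
        min-width: 6em;         /* a floor, not a fixed width */
        max-width: 100%;        /* never overflow the container */
        padding: 0.5em 1.25em;  /* breathing room around the label */
        white-space: normal;    /* let long strings wrap to a second line */
        text-align: center;     /* keep wrapped lines easy to scan */
      }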
  • Standards for Writing Accessibly

    Writing to meet WCAG2 standards can be a challenge, but it’s worthwhile. Albert Einstein, the archetypical genius and physicist, once said, “Any fool can make things bigger, more complex, and more violent. It takes a touch of genius—and a lot of courage—to move in the opposite direction.”

    Hopefully, this entire book will help you better write for accessibility. So far, you’ve learned:

    • Why clarity is important
    • How to structure messages for error states and stress cases
    • How to test the effectiveness of the words you write

    All that should help your writing be better for screen readers, give additional context to users who may need it, and be easier to parse.

    But there are a few specific points that you may not otherwise think about, even after reading these pages.

    Writing for Screen Readers

    People with little or no sight interact with apps and websites in a much different way than sighted people do. Screen readers parse the elements on the screen (to the best of their abilities) and read them back to the user. And along the way, there are many ways this could go wrong. As the interface writer, your role is perhaps most important in giving screen reader users the best context.

    Here are a few things to keep in mind about screen readers:

    • The average reading speed for sighted readers is two to five words per second. Screen-reader users can comprehend text being read at an average of 35 syllables per second, which is significantly faster. Don’t be afraid to sacrifice brevity for clarity, especially when extra context is needed or useful.
    • People want to be able to skim long blocks of text, regardless of sight or audio, so it’s extremely important to structure your longform writing with headers, short paragraphs, and other content design best practices.

    Write Chronologically, Not Spatially

    Writing chronologically is about describing the order of things, rather than where they appear spatially in the interface. There are so many good reasons to do this (devices and browsers will render interfaces differently), but screen readers show you the most valuable reason. You’ll often be faced with writing tooltips or onboarding elements that say something like, “Click the OK button below to continue.” Or “See the instructions above to save your document.”

    Screen readers will do their job and read those instructions aloud to someone who can’t see the spatial relationships between words and objects. While many times they can cope with that, they shouldn’t have to. Consider screen reader users in your language choices. Embrace the universal experience shared by humans and rely on their intrinsic understanding of the “top is first, bottom is last” paradigm. Write chronologically, as in Figure 5.5.

    A screenshot of a form with a password field. The password hints show up below the password field.
    FIGURE 5.5 Password hint microcopy below the password field won’t help someone using a screen reader who hasn’t made it there yet.

    Rather than saying:

    • Click the OK button below to continue.
    • (A button that scrolls you to the top of a page): Go to top.

    Instead, say:

    • Next, select OK to continue.
    • Go to beginning.

    Write Left to Right, Top to Bottom

    While you don’t want to convey spatial meaning in your writing, you still want to keep that spatial order in mind.

    Have you ever purchased a service or a product, only to find out later that there were conditions you didn’t know about before you paid for it? Maybe you didn’t realize batteries weren’t included in that gadget, or that by signing up for that social network, you were implicitly agreeing to provide data to third-party advertisers.

    People who use screen readers face this all the time.

    Most screen readers will parse information from left to right, from top to bottom. Think about a few things when reviewing the order and placement of your words. Is there information critical to performing an action, or making a decision, that appears after (to the right of or below) an action item, like in Figure 5.5? If so, consider moving it up in the interface.

    Instead, if there’s information critical to an action (rules around setting a password, for example, or accepting terms of service before proceeding), place it before the text field or action button. Even if it’s hidden in a tooltip or info button, it should be presented before a user arrives at a decision point.

    Don’t Use Colors and Icons Alone

    If you are a sighted American user of digital products, there’s a pretty good chance that if you see a message in red, you’ll interpret it as a warning message or think something’s wrong. And if you see a message in green, you’ll likely associate that with success. But while colors aid in conveying meaning to this type of user, they don’t necessarily mean the same thing to those from other cultures.

    For example, although red might indicate excitement or danger in the U.S. (broadly speaking), in other cultures it means something entirely different:

    • In China, it represents good luck.
    • In some former Soviet, Eastern European countries, it’s the color strongly associated with Communism.
    • In India, it represents purity.

    Yellow, which we in the U.S. often use to mean “caution” (because we’re borrowing a mental model from traffic lights), might convey another meaning for people in other cultures:

    • In Latin America, yellow is associated with death.
    • In Eastern and Asian cultures, it’s a royal color—sacred and often imperial.

    And what about users with color-blindness or low to no vision? And what about screen readers? Intrinsic meaning from the interface color means nothing for them. Be sure to add words that bear context so that if you heard the message being read aloud, you would understand what was being said, as in Figure 5.6.

    Two error tooltips: one that says 'Save your work!' and one that says 'Save your work before going to the next step'
    FIGURE 5.6 While a simple in-app message warning a user to save their work before proceeding is more effective, visually, if it is red and has a warning icon, as seen on the left, you should provide more context when possible. The example on the right explicitly says that a user won’t be able to proceed to the next step before saving their work.

    Describe the Action, Not the Behavior

    Touch-first interfaces have been steadily growing and replacing keyboard/mouse interfaces for years, so no longer are users “clicking” a link or a button. But they’re not necessarily “tapping” it either, especially if they’re using a voice interface or an adaptive device.

    Instead of microcopy that includes behavioral actions like:

    • Click
    • Tap
    • Press
    • See

    Try device-agnostic words that describe the action, irrespective of the interface, like:

    • Choose
    • Select
    • View

    There are plenty of exceptions to this rule. If your interface requires a certain action to execute a particular function, and you need to teach the user how their gesture affects the interface (“Pinch to zoom out,” for example), then of course you need to describe the behavior. But generally, the copy you’re writing will be simpler and more consistent if you stick with the action in the context of the interface itself.

  • Making Room for Variation

    Making a brand feel unified, cohesive, and harmonious while also leaving room for experimentation is a tough balancing act. It’s one of the most challenging aspects of a design system.

    Graphic designer and Pentagram partner Paula Scher faced this challenge with the visual identity for the Public Theater in New York. As she explained in a talk at Beyond Tellerrand:

    I began to realize that if you made everything the same, it was boring after the first year. If you changed it individually for each play, the theater lost recognizability. The thing to do, which I totally got for the first time after working there at this point for 17 years, is what they needed to have were seasons.

    You could take the typography and the color system for the summer festival, the Shakespeare in the Park Festival, and you could begin to translate it into posters by flopping the colors, but using some of the same motifs, and you could create entire seasons out of the graphics. That would become its own standards manual where I have about six different people making these all year (http://bkaprt.com/eds/04-01/).

    Scher’s strategy was to retain the Public Theater’s visual language every year, but to vary some of its elements (Fig 4.1–2). Colors would be swapped. Text would skew in different directions. New visual motifs would be introduced. The result is that each season coheres in its own way, but so does the identity of the Public Theater as a whole.

    Sixteen Public Theater posters in black, white, and yellow, with slanted wood type letterforms and high-contrast images of people.
    Fig 4.1: The posters for the 2014/15 season featured the wood type style the Public Theater is known for, but the typography was skewed. The color palette was restrained to yellow, black, and white, which led to a dynamic look when coupled with the skewed type (http://bkaprt.com/eds/04-02/).
    Twelve Public Theater posters using black, white, and pastel colors with wood type letterforms and softer images of people.
    Fig 4.2: For the 2018 season, the wood type letterforms were extended on a field of gradated color. The grayscale cut-out photos we saw in the 2014/15 season persisted, but this time in lower contrast to fit better with the softer color tones (http://bkaprt.com/eds/04-03/).

    Even the most robust or thoroughly planned systems will need to account for variation at some point. As soon as you release a design system, people will ask you how to deviate from it, and you’ll want to be armed with persuasive answers. In this chapter, I’m going to talk about what variation means for a design system, how to know when you need it, and how to manage it in a scalable way.

    What Is Variation?

    We’ve spent most of this book talking about the importance of unity, cohesion, and harmony in a design system. So why are we talking about variation? Isn’t that at odds with all of the goals we’ve set until now?

    Variation is a deviation from established patterns, and it can exist at every level of the system. At the component level, for instance, a team may discover that they need a component to behave in a slightly different way; maybe this particular component needs to appear without a photo, for example. At a design-language level, you may have a team that has a different audience, so they want to adjust their brand identity to serve that audience better. You can even have variation at the level of design principles: if a team is working on a product that is functionally different from your core product, they may need to adjust their principles to suit that context.

    There are three kinds of deviations that come up in a design system:

    • Unintentional divergence typically happens when designers can’t find the information they’re looking for. They may not know that a certain solution exists within a system, so they create their own style. Clear, easy-to-find documentation and usage guidelines can help your team avoid unintentional variation.
    • Intentional but unnecessary divergence usually results from designers not wanting to feel constrained by the system, or believing they have a better solution. Making sure your team knows how to push back on and contribute to the system can help mitigate this kind of variation.
    • Intentional, meaningful divergence is the goal of an expressive design system. In this case, the divergence is meaningful because it solves a very specific user problem that no existing pattern solves.

    We want to enable intentional, meaningful variation. To do this, we need to understand the needs and contexts for variation.

    Contexts for Variation

    Every variation we add makes our design system more complicated. Therefore, we need to take care to find the right moments for variation. Three big contextual changes are served by variation: brand, audience, and environment.

    Brand

    If you’re creating a system for multiple brands, each with its own brand language, then your system needs to support variations to reflect those brands.

    The key here is to find the common core elements and then set some criteria for how you should deviate. When we were creating the design system for our websites at Vox Media, we constantly debated which elements should feel more expressive. Should a footer be standardized, or should we allow for tons of customization? We went back to our core goals: our users were ultimately visiting our websites to consume editorial content. So the variations should be in service of the content, writing style, and tone of voice for each brand.

    The newsletter modules across Vox Media brands were an example of unnecessary variation. They were consistent in functionality and layout, but had variations in type, color, and visual treatments like borders (Fig 4.3). There was quite a bit of custom design within a very small area: Curbed’s newsletter component had a skewed background, for example, while Eater’s had a background image. Because these modules were so consistent in their user goals, we decided to unify their design and create less variation (Fig 4.4).

    Fig 4.3: Older versions of Vox Media’s newsletter modules contained lots of unnecessary visual variation.
    Three examples of newsletter modules, showing the same colors, fonts, and spacing.
    Fig 4.4: The new, unified newsletter modules.

    The unified design cleaned up some technical debt. In the previous design, each newsletter module had CSS overrides to achieve distinct styling. Some modules even had overrides on the primary button color so it would work better with the background color. Little CSS overrides like this add up over time. Whenever we released a new change, we’d have to manually update the spots containing CSS overrides.
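
    As an illustration of how that kind of debt accumulates (the class names, colors, and custom property are hypothetical, not Vox Media’s actual code):

      /* Before: per-brand overrides scattered across the codebase */
      .newsletter--brand-a .button { background: #ff6b6b; }
      .newsletter--brand-b .button { background: #8b0000; }

      /* After: one unified module that inherits the system’s primary color */
      .newsletter .button { background: var(--color-primary); }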

    The streamlined design also placed a more appropriate emphasis on the newsletter module. While important, this module isn’t the star of the page. It doesn’t need loud backgrounds or fancy shapes to command attention, especially since it’s placed around article content. Variation in this module wasn’t necessary for expressing the brands.

    On the other hand, consider the variation in Vox Media’s global header components. When we were redesigning the Verge, its editorial teams were vocal about wanting more latitude to art-direct the page, guide attention toward big features, and showcase custom illustrations. We addressed this by creating a masthead component (Fig 4.5) that sits on top of the global header on homepages. It contains a logo, tagline, date, and customizable background image. Though at the time this was a one-off component, we felt that the variation was valuable because it would strengthen the Verge’s brand voice.

    Example of the Verge’s masthead component with magenta and blue abstractions. Example of the Verge’s masthead component with a city skyline in orange tones. Example of the Verge’s masthead component in pixelated black and white.
    Fig 4.5: Examples of the Verge's masthead component

    The Verge team commissions or makes original art that changes throughout the day. The most exciting part is that when they drop a big feature, they can pair the masthead with a one-up hero, using these flexible components to art-direct the page (Fig 4.6). Soon after launch, the Verge masthead even got a Twitter fan account (@VergeTaglines) that tweets every time the image changes.

    Comparison of the Verge’s homepage, changing based on the masthead design and hero photography.
    Fig 4.6: The Verge uses two generic components, the masthead and one-up hero, to art-direct its homepages.

    Though this component was built specifically for the Verge, it soon gained broader application with other brands that share Vox’s publishing platform, Chorus. The McElroy Family website, for example, needed to convey its sense of humor and Appalachian roots; the masthead component shines with an original illustration featuring an adorable squirrel (Fig 4.7).

    The masthead component for the McElroy Family, showing a blue navigation bar and a pastel illustration of a forest.
    Fig 4.7: The McElroy Family site uses the same masthead component as the Verge to display a custom illustration.
    The masthead component for the Chicago Sun-Times, showing a white background, stark black text, and a red Subscribe button.
    Fig 4.8: The same masthead component on the Chicago Sun-Times site.

    The Chicago Sun-Times—another Chorus platform site—is very different in content, tone, and audience from The McElroy Family, but the masthead component is just as valuable in conveying the tone of the organization’s high-quality investigative journalism and breaking news coverage (Fig 4.8).

    Why did the masthead variation work well while the newsletter variation didn’t? The variations on the newsletter design were purely visual. When we created them, we didn’t have a strategy for how variation should work; instead, we were looking for any opportunity to make the brands feel distinct. The masthead variation, by contrast, tied directly into the brand strategy. Even though it began as a one-off for the Verge, it was flexible and purposeful enough to migrate to other brands.

    Audience

    The next contextual variation comes from audience. If your products serve different audiences who all need different things, then your system may need to adapt to fit those needs.

    A good example of this is Airbnb’s listing pages. In addition to their standard listings, they also have Airbnb Plus—one-of-a-kind, high-quality rentals at higher price points. Audiences booking a Plus listing are probably looking for exceptional quality and attention to detail.

    Both Airbnb’s standard listing page and Plus listing page are immediately recognizable as belonging to the same family because they use many consistent elements (Fig 4.9). They both use Airbnb’s custom font, Cereal. They both highlight photography. They both use many of the same components, like the date picker. The iconography is the same.

    Screenshot of Airbnb’s standard listing. Screenshot of Airbnb’s Plus listing.
    Fig 4.9: The same brand elements in Airbnb’s standard listings (above) are used in their Plus listings (below), but with variations that make the listing styles distinct.

    However, some of the design choices convey a different attitude. Airbnb Plus uses larger typography, airier vertical space, and a lighter weight of Cereal. It has a more understated color palette, with a deeper color on the call to action. These choices make Airbnb Plus feel like a more premium experience. You can see they’ve adjusted the density, weight, and scale levers to achieve a more elegant and sophisticated aesthetic.

    The standard listing page, on the other hand, is more functional, with the booking module front and center and less size contrast between elements. The Plus design, by contrast, pulls the density and weight levers in a lighter, airier direction.

    Because they use the same core building blocks—the same typography, iconography, and components—both experiences feel like Airbnb. However, the variations in spacing, typographic weights, and color help distinguish the standard listing from the premium listing.

    Environment

    I’ve mainly been talking about adding variation to a system to allow for a range of content tones, but you may also need your system to scale based on environmental contexts. “Environment” in this context asks: Where will your products be used? Will that have an impact on the experience? Environments are the various constraints and pressures that surround and inform an experience. That can include lighting, ambient noise, passive or active engagement, expected focus level, or devices.

    Shopify’s Polaris design system initially grew out of Shopify’s Store Management product. When the Shopify Retail team kicked off a project to design the next generation point-of-sale (POS) system, they realized that the patterns in Polaris didn’t exactly fit their needs. The POS system needed to work well in a retail space, often under bright lighting. The app needed to be used at arm’s length, twenty-four to thirty-six inches away from the merchant. And unlike the core admin, where the primary interaction is between the merchant and the UI, merchants using the POS system needed to prioritize their interactions with their customers instead of the UI. The Retail team wanted merchants to achieve an “eyes-closed” level of mastery over the UI so they could maintain eye contact with their customers.

    The Retail team decided that the existing color palette, which only worked on a light background, would not be clear enough under the bright lights of a retail shop. The type scale was also too small to be used at arm’s length. And in order for merchants to use the POS system without breaking eye contact with customers, the buttons and other UI elements would need to be much larger.

    The Retail team recognized that the current design system didn’t support a variety of environmental scenarios. But after talking with the Polaris team, they realized that other teams would benefit from the solutions they created. The Warehouse team, for example, was also developing an app that needed to be used at arm’s length under bright lights. This work inspired the Polaris team to create a dark mode for the system (Fig 4.10).

    Comparison of light and dark modes for navigation menus in the Polaris design system.
    Fig 4.10: Polaris light mode (left) and dark mode (right).

    This feedback loop between product team and design system team is a great example of how to build the right variation into your system. Build your system around helping users navigate your product and serving content needs, and you’ll unlock scalable expression.

  • Request with Intent: Caching Strategies in the Age of PWAs

    Once upon a time, we relied on browsers to handle caching for us; as developers in those days, we had very little control. But then came Progressive Web Apps (PWAs), Service Workers, and the Cache API—and suddenly we have expansive power over what gets put in the cache and how it gets put there. We can now cache everything we want to… and therein lies a potential problem.

    Media files—especially images—make up the bulk of average page weight these days, and it’s getting worse. In order to improve performance, it’s tempting to cache as much of this content as possible, but should we? In most cases, no. Even with all this newfangled technology at our fingertips, great performance still hinges on a simple rule: request only what you need and make each request as small as possible.

    To provide the best possible experience for our users without abusing their network connection or their hard drive, it’s time to put a spin on some classic best practices, experiment with media caching strategies, and play around with a few Cache API tricks that Service Workers have hidden up their sleeves.

    Best intentions

    All those lessons we learned optimizing web pages for dial-up became super-useful again when mobile took off, and they continue to be applicable in the work we do for a global audience today. Unreliable or high latency network connections are still the norm in many parts of the world, reminding us that it’s never safe to assume a technical baseline lifts evenly or in sync with its corresponding cutting edge. And that’s the thing about performance best practices: history has borne out that approaches that are good for performance now will continue being good for performance in the future.

    Before the advent of Service Workers, we could provide some instructions to browsers with respect to how long they should cache a particular resource, but that was about it. Documents and assets downloaded to a user’s machine would be dropped into a directory on their hard drive. When the browser assembled a request for a particular document or asset, it would peek in the cache first to see if it already had what it needed to possibly avoid hitting the network.

    We have considerably more control over network requests and the cache these days, but that doesn’t excuse us from being thoughtful about the resources on our web pages.

    Request only what you need

    As I mentioned, the web today is lousy with media. Images and videos have become a dominant means of communication. They may convert well when it comes to sales and marketing, but they are hardly performant when it comes to download and rendering speed. With this in mind, each and every image (and video, etc.) should have to fight for its place on the page. 

    A few years back, a recipe of mine was included in a newspaper story on cooking with spirits (alcohol, not ghosts). I don’t subscribe to the print version of that paper, so when the article came out I went to the site to take a look at how it turned out. During a recent redesign, the site had decided to load all articles into a nearly full-screen modal viewbox layered on top of their homepage. This meant requesting the article required requests for all of the assets associated with the article page plus all the contents and assets for the homepage. Oh, and the homepage had video ads—plural. And, yes, they auto-played.

    I popped open DevTools and discovered the page had blown past 15 MB in page weight. Tim Kadlec had recently launched What Does My Site Cost?, so I decided to check out the damage. Turns out that the actual cost to view that page for the average US-based user was more than the cost of the print version of that day’s newspaper. That’s just messed up.

    Sure, I could blame the folks who built the site for doing their readers such a disservice, but the reality is that none of us go to work with the goal of worsening our users’ experiences. This could happen to any of us. We could spend days scrutinizing the performance of a page only to have some committee decide to set that carefully crafted page atop a Times Square of auto-playing video ads. Imagine how much worse things would be if we were stacking two abysmally-performing pages on top of each other!

    Media can be great for drawing attention when competition is high (e.g., on the homepage of a newspaper), but when you want readers to focus on a single task (e.g., reading the actual article), its value can drop from important to “nice to have.” Yes, studies have shown that images excel at drawing eyeballs, but once a visitor is on the article page, no one cares; we’re just making it take longer to download and more expensive to access. The situation only gets worse as we shove more media into the page. 

    We must do everything in our power to reduce the weight of our pages, so avoid requests for things that don’t add value. For starters, if you’re writing an article about a data breach, resist the urge to include that ridiculous stock photo of some random dude in a hoodie typing on a computer in a very dark room.

    Request the smallest file you can

    Now that we’ve taken stock of what we do need to include, we must ask ourselves a critical question: How can we deliver it in the fastest way possible? This can be as simple as choosing the most appropriate image format for the content presented (and optimizing the heck out of it) or as complex as recreating assets entirely (for example, if switching from raster to vector imagery would be more efficient).

    Offer alternate formats

    When it comes to image formats, we don’t have to choose between performance and reach anymore. We can provide multiple options and let the browser decide which one to use, based on what it can handle.

    You can accomplish this by offering multiple sources within a picture or video element. Start by creating multiple formats of the media asset. For example, with WebP and JPG, it’s likely that the WebP will have a smaller file size than the JPG (but check to make sure). Then you can drop those alternate sources into a picture element like this:

    <picture>
      <source srcset="/my.webp" type="image/webp">
      <img src="/my.jpg" alt="Descriptive text about the picture.">
    </picture>

    Browsers that recognize the picture element will check the source element before making a decision about which image to request. If the browser supports the MIME type “image/webp,” it will kick off a request for the WebP format image. If not (or if the browser doesn’t recognize picture), it will request the JPG. 

    The nice thing about this approach is that you’re serving the smallest image possible to the user without having to resort to any sort of JavaScript hackery.

    You can take the same approach with video files:

    <video controls>
      <source src="/my.webm" type="video/webm">
      <source src="/my.mp4" type="video/mp4">
      <p>Your browser doesn’t support native video playback,
        but you can <a href="/my.mp4" download>download</a>
        this video instead.</p>
    </video>

    Browsers that support WebM will request the first source, whereas browsers that don’t—but do understand MP4 videos—will request the second one. Browsers that don’t support the video element will fall back to the paragraph about downloading the file.

    The order of your source elements matters. Browsers will choose the first usable source, so if you specify an optimized alternative format after a more widely compatible one, the alternative format may never get picked up.  

    Depending on your situation, you might consider bypassing this markup-based approach and handle things on the server instead. For example, if a JPG is being requested and the browser supports WebP (which is indicated in the Accept header), there’s nothing stopping you from replying with a WebP version of the resource. In fact, some CDN services—Cloudinary, for instance—come with this sort of functionality right out of the box.
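
    As a sketch of how that server-side negotiation might look, here’s an Express-style Node handler; the route, directory layout, and setup are illustrative assumptions rather than a recommendation:

    const express = require( "express" );
    const fs = require( "fs" );
    const path = require( "path" );
    const app = express();

    // Illustrative route: serve /img/whatever.jpg, upgrading to WebP
    // when the browser advertises support in its Accept header
    app.get( "/img/:name.jpg", ( req, res ) => {
      const webp = path.join( __dirname, "img", req.params.name + ".webp" );
      // Tell intermediate caches that the response depends on Accept
      res.set( "Vary", "Accept" );
      if ( req.headers.accept &&
           req.headers.accept.includes( "image/webp" ) &&
           fs.existsSync( webp ) ) {
        return res.sendFile( webp );
      }
      res.sendFile( path.join( __dirname, "img", req.params.name + ".jpg" ) );
    });

    app.listen( 8080 );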

    Offer different sizes

    Formats aside, you may want to deliver alternate image sizes optimized for the current size of the browser’s viewport. After all, there’s no point loading an image that’s 3–4 times larger than the screen rendering it; that’s just wasting bandwidth. This is where responsive images come in.

    Here’s an example:

    <img src="/medium.jpg"
      srcset="/small.jpg 256w,
        /medium.jpg 512w,
        /large.jpg 1024w"
      sizes="(min-width: 30em) 30em, 100vw"
      alt="Descriptive text about the picture.">

    There’s a lot going on in this super-charged img element, so I’ll break it down:

    • This img offers three size options for a given JPG: 256 px wide (small.jpg), 512 px wide (medium.jpg), and 1024 px wide (large.jpg). These are provided in the srcset attribute with corresponding width descriptors.
    • The src defines a default image source, which acts as a fallback for browsers that don’t support srcset. Your choice for the default image will likely depend on the context and general usage patterns. Often I’d recommend the smallest image be the default, but if the majority of your traffic is on older desktop browsers, you might want to go with the medium-sized image.
    • The sizes attribute is a presentational hint that informs the browser how the image will be rendered in different scenarios (its extrinsic size) once CSS has been applied. This particular example says that the image will be the full width of the viewport (100vw) until the viewport reaches 30 em in width (min-width: 30em), at which point the image will be 30 em wide. You can make the sizes value as complicated or as simple as you want; omitting it causes browsers to use the default value of 100vw.

    You can even combine this approach with alternate formats and crops within a single picture. 🤯

    All of this is to say that you have a number of tools at your disposal for delivering fast-loading media, so use them!

    Defer requests (when possible)

    Years ago, Internet Explorer 11 introduced a new attribute that enabled developers to de-prioritize specific img elements to speed up page rendering: lazyload. That attribute never went anywhere, standards-wise, but it was a solid attempt to defer image loading until images are in view (or close to it) without having to involve JavaScript.

    There have been countless JavaScript-based implementations of lazy loading images since then, but recently Google also took a stab at a more declarative approach, using a different attribute: loading.

    The loading attribute supports three values (“auto,” “lazy,” and “eager”) to define how a resource should be brought in. For our purposes, the “lazy” value is the most interesting because it defers loading the resource until it reaches a calculated distance from the viewport.

    Adding that into the mix…

    <img src="/medium.jpg"
      srcset="/small.jpg 256w,
        /medium.jpg 512w,
        /large.jpg 1024w"
      sizes="(min-width: 30em) 30em, 100vw"
      loading="lazy"
      alt="Descriptive text about the picture.">

    This attribute offers a bit of a performance boost in Chromium-based browsers. Hopefully it will become a standard and get picked up by other browsers in the future, but in the meantime there’s no harm in including it because browsers that don’t understand the attribute will simply ignore it.
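
    And if you want older browsers to lazy-load as well, you can feature-detect the attribute and hand things off to a script-based solution; lazyLoadWithScript() below is a hypothetical stand-in for whichever implementation you prefer:

    // If the browser supports loading="lazy" natively, there's
    // nothing to do; otherwise, fall back to script
    if ( !( "loading" in HTMLImageElement.prototype ) ) {
      document.querySelectorAll( "img[loading=lazy]" )
        .forEach( img => lazyLoadWithScript( img ) );
    }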

    This approach complements a media prioritization strategy really well, but before I get to that, I want to take a closer look at Service Workers.

    Manipulate requests in a Service Worker

    Service Workers are a special type of Web Worker with the ability to intercept, modify, and respond to all network requests via the Fetch API. They also have access to the Cache API, as well as other asynchronous client-side data stores like IndexedDB for resource storage.

    When a Service Worker is installed, you can hook into that event and prime the cache with resources you want to use later. Many folks use this opportunity to squirrel away copies of global assets, including styles, scripts, logos, and the like, but you can also use it to cache images for use when network requests fail.
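
    That priming step might look something like this minimal sketch; the cache name and asset paths are examples (though the fallback image path matches the one used in the next section):

    self.addEventListener( "install", event => {
      // Hold installation open until the cache has been primed
      event.waitUntil(
        caches.open( "global-assets" ).then( cache => {
          return cache.addAll([
            "/c/global.css",
            "/j/global.js",
            "/i/logo.svg",
            "/i/fallbacks/offline.svg"
          ]);
        })
      );
    });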

    Keep a fallback image in your back pocket

    Assuming you want to use a fallback in more than one networking recipe, you can set up a named function that will respond with that resource:

    function respondWithFallbackImage() {
      return caches.match( "/i/fallbacks/offline.svg" );
    }

    Then, within a fetch event handler, you can use that function to provide that fallback image when requests for images fail at the network:

    self.addEventListener( "fetch", event => {
      const request = event.request;
      // Only intercept requests for images
      if ( request.headers.get("Accept").includes("image") ) {
        event.respondWith(
          fetch( request, { mode: "no-cors" } )
            .catch( respondWithFallbackImage )
        );
      }
    });

    When the network is available, users get the expected behavior:

    Screenshot of a component showing a series of user profile images of users who have liked something
    Social media avatars are rendered as expected when the network is available.

    But when the network is interrupted, images will be swapped automatically for a fallback, and the user experience is still acceptable:

    Screenshot showing a series of identical generic user images in place of the individual ones which have not loaded
    A generic fallback avatar is rendered when the network is unavailable.

    On the surface, this approach may not seem all that helpful in terms of performance since you’ve essentially added an additional image download into the mix. With this system in place, however, some pretty amazing opportunities open up to you.

    Respect a user’s choice to save data

    Some users reduce their data consumption by entering a “lite” mode or turning on a “data saver” feature. When this happens, browsers will often send a Save-Data header with their network requests. 

    Within your Service Worker, you can look for this preference and adjust your responses accordingly. First, check for it (the saveData flag on the Network Information API mirrors the header):

    let save_data = false;
    if ( 'connection' in navigator ) {
      save_data = navigator.connection.saveData;
    }

    Then, within your fetch handler for images, you might choose to preemptively respond with the fallback image instead of going to the network at all:

    self.addEventListener( "fetch", event => {
      const request = event.request;
      if ( request.headers.get("Accept").includes("image") ) {
        // Skip the network entirely when the user wants to save data
        if ( save_data ) {
          event.respondWith( respondWithFallbackImage() );
          return;
        }
        // code you saw previously
      }
    });

    You could even take this a step further and tune respondWithFallbackImage() to provide alternate images based on what the original request was for. To do that you’d define several fallbacks globally in the Service Worker:

    const fallback_avatar = "/i/fallbacks/avatar.svg",
          fallback_image = "/i/fallbacks/image.svg";

    Both of those files should then be cached during the Service Worker install event:

    // inside the install handler, within caches.open(…).then( cache => … )
    return cache.addAll( [
      fallback_avatar,
      fallback_image
    ]);

    Finally, within respondWithFallbackImage() you could serve up the appropriate image based on the URL being fetched. On my site, the avatars are pulled from Webmention.io, so I test for that.

    function respondWithFallbackImage( url ) {
      // Webmention.io URLs get the avatar fallback; anything else
      // gets the generic image fallback
      const image = /webmention\.io/.test( url ) ? fallback_avatar
                                                 : fallback_image;
      return caches.match( image );
    }

    With that change, I’ll need to update the fetch handler to pass in request.url as an argument to respondWithFallbackImage(). Once that’s done, when the network gets interrupted I end up seeing something like this:

    Screenshot showing a blog comment with a generic user profile image and image placeholder where the network could not load the actual images
    A webmention that contains both an avatar and an embedded image will render with two different fallbacks when the Save-Data header is present.
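
    For reference, the handler update mentioned above is a one-line change to the catch clause from earlier:

    // within the fetch event handler shown earlier
    event.respondWith(
      fetch( request, { mode: "no-cors" } )
        .catch( () => respondWithFallbackImage( request.url ) )
    );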

    Next, we need to establish some general guidelines for handling media assets—based on the situation, of course.

    The caching strategy: prioritize certain media

    In my experience, media—especially images—on the web tend to fall into three categories of necessity. At one end of the spectrum are elements that don’t add meaningful value. At the other end of the spectrum are critical assets that do add value, such as charts and graphs that are essential to understanding the surrounding content. Somewhere in the middle are what I would call “nice-to-have” media. They do add value to the core experience of a page but are not critical to understanding the content.

    If you consider your media with this division in mind, you can establish some general guidelines for handling each, based on the situation. In other words, a caching strategy.

    Media loading strategy, broken down by how critical an asset is to understanding an interface:

    Media category | Fast connection | Save-Data                | Slow connection          | No network
    Critical       | Load media      | Load media               | Load media               | Replace with placeholder
    Nice-to-have   | Load media      | Replace with placeholder | Replace with placeholder | Replace with placeholder
    Non-critical   | Remove from content entirely (all connection types)

    When it comes to disambiguating the critical from the nice-to-have, it’s helpful to have those resources organized into separate directories (or similar). That way we can add some logic into the Service Worker that can help it decide which is which. For example, on my own personal site, critical images are either self-hosted or come from the website for my book. Knowing that, I can write regular expressions that match those domains:

    const high_priority = [
        /aaron\-gustafson\.com/,
        /adaptivewebdesign\.info/
      ];

    With that high_priority variable defined, I can create a function that will let me know if a given image request (for example) is a high priority request or not:

    function isHighPriority( url ) {
      // how many high priority links are we dealing with?
      let i = high_priority.length;
      // loop through each
      while ( i-- ) {
        // does the request URL match this regular expression?
        if ( high_priority[i].test( url ) ) {
          // yes, it’s a high priority request
          return true;
        }
      }
      // no matches, not high priority
      return false;
    }

    Adding support for prioritizing media requests only requires adding a new conditional into the fetch event handler, like we did with Save-Data. Your specific recipe for network and cache handling will likely differ, but here was how I chose to mix in this logic within image requests:

    // Check the cache first
      // Return the cached image if we have one
      // If the image is not in the cache, continue
    
    // Is this image high priority?
    if ( isHighPriority( url ) ) {
    
      // Fetch the image
        // If the fetch succeeds, save a copy in the cache
        // If not, respond with an "offline" placeholder
    
    // Not high priority
    } else {
    
      // Should I save data?
      if ( save_data ) {
    
        // Respond with a "saving data" placeholder
    
      // Not saving data
      } else {
    
        // Fetch the image
          // If the fetch succeeds, save a copy in the cache
          // If not, respond with an "offline" placeholder
      }
    }
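
    Fleshed out into running code, that outline might look something like the sketch below; fetchAndCache() is a hypothetical helper, and the single fallback function stands in for both the “offline” and “saving data” placeholders:

    self.addEventListener( "fetch", event => {
      const request = event.request;
      if ( request.headers.get("Accept").includes("image") ) {
        event.respondWith(
          // Check the cache first; return the cached image if we have one
          caches.match( request ).then( cached => {
            if ( cached ) {
              return cached;
            }
            // Is this image high priority?
            if ( isHighPriority( request.url ) ) {
              // Fetch it, caching a copy; placeholder if we're offline
              return fetchAndCache( request )
                .catch( () => respondWithFallbackImage( request.url ) );
            }
            // Not high priority: respect Save-Data
            if ( save_data ) {
              return respondWithFallbackImage( request.url );
            }
            return fetchAndCache( request )
              .catch( () => respondWithFallbackImage( request.url ) );
          })
        );
      }
    });

    // Hypothetical helper: fetch a request and stash a copy in the
    // cache (the cache name here is illustrative)
    function fetchAndCache( request ) {
      return fetch( request, { mode: "no-cors" } ).then( response => {
        const copy = response.clone();
        caches.open( "images" ).then( cache => cache.put( request, copy ) );
        return response;
      });
    }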

    We can apply this prioritized approach to many kinds of assets. We could even use it to control which pages are served cache-first vs. network-first.

    Keep the cache tidy

    The ability to control which resources are cached to disk is a huge opportunity, but it also carries with it an equally huge responsibility not to abuse it.

    Every caching strategy is likely to differ, at least a little bit. If we’re publishing a book online, for instance, it might make sense to cache all of the chapters, images, etc. for offline viewing. There’s a fixed amount of content and—assuming there aren’t a ton of heavy images and videos—users will benefit from not having to download each chapter separately.

    On a news site, however, caching every article and photo will quickly fill up our users’ hard drives. If a site offers an indeterminate number of pages and assets, it’s critical to have a caching strategy that puts hard limits on how many resources we’re caching to disk. 

    One way to do this is to create several different blocks associated with caching different forms of content. The more ephemeral content caches can have strict limits around how many items can be stored. Sure, we’ll still be bound to the storage limits of the device, but do we really want our website to take up 2 GB of someone’s hard drive?

    Here’s an example, again from my own site:

    const sw_caches = {
      static: {
        name: `${version}static`
      },
      images: {
        name: `${version}images`,
        limit: 75
      },
      pages: {
        name: `${version}pages`,
        limit: 5
      },
      other: {
        name: `${version}other`,
        limit: 50
      }
    }

    Here I’ve defined several caches, each with a name used for addressing it in the Cache API and a version prefix. The version is defined elsewhere in the Service Worker, and allows me to purge all caches at once if necessary.

    With the exception of the static cache, which is used for static assets, every cache has a limit to the number of items that may be stored. I only cache the most recent 5 pages someone has visited, for instance. Images are limited to the most recent 75, and so on. This is an approach that Jeremy Keith outlines in his fantastic book Going Offline (which you should really read if you haven’t already—here’s a sample).

    With these cache definitions in place, I can clean up my caches periodically and prune the oldest items. Here’s Jeremy’s recommended code for this approach:

    function trimCache(cacheName, maxItems) {
      // Open the cache
      caches.open(cacheName)
      .then( cache => {
        // Get the keys and count them
        cache.keys()
        .then(keys => {
          // Do we have more than we should?
          if (keys.length > maxItems) {
            // Delete the oldest item and run trim again
            cache.delete(keys[0])
            .then( () => {
              trimCache(cacheName, maxItems)
            });
          }
        });
      });
    }

    We can trigger this code to run whenever a new page loads. Because it runs in the Service Worker, on a separate thread, it won’t drag down the site’s responsiveness. We trigger it by posting a message (using postMessage()) to the Service Worker from the main JavaScript thread:

    // First check to see if you have an active service worker
    if ( navigator.serviceWorker.controller ) {
      // Then add an event listener
      window.addEventListener( "load", function(){
        // Tell the service worker to clean up
        navigator.serviceWorker.controller.postMessage( "clean up" );
      });
    }

    The final step in wiring it all up is setting up the Service Worker to receive the message:

    addEventListener("message", messageEvent => {
      if (messageEvent.data == "clean up") {
        // loop through the caches
        for ( let key in sw_caches ) {
          // if the cache has a limit
          if ( sw_caches[key].limit !== undefined ) {
            // trim it to that limit
            trimCache( sw_caches[key].name, sw_caches[key].limit );
          }
        }
      }
    });

    Here, the Service Worker listens for inbound messages and responds to the “clean up” request by running trimCache() on each of the cache buckets with a defined limit.

    This approach is by no means elegant, but it works. It would be far better to make decisions about purging cached responses based on how frequently each item is accessed and/or how much room it takes up on disk. (Removing cached items based purely on when they were cached isn’t nearly as useful.) Sadly, we don’t have that level of detail when it comes to inspecting the caches…yet. I’m actually working to address this limitation in the Cache API right now.

    Your users always come first

    The technologies underlying Progressive Web Apps are continuing to mature, but even if you aren’t interested in turning your site into a PWA, there’s so much you can do today to improve your users’ experiences when it comes to media. And, as with every other form of inclusive design, it starts with centering on your users who are most at risk of having an awful experience.

    Draw distinctions between critical, nice-to-have, and superfluous media. Remove the cruft, then optimize the bejeezus out of each remaining asset. Serve your media in multiple formats and sizes, prioritizing the smallest versions first to make the most of high latency and slow connections. If your users say they want to save data, respect that and have a fallback plan in place. Cache wisely and with the utmost respect for your users’ disk space. And, finally, audit your caching strategies regularly—especially when it comes to large media files.

    Follow these guidelines, and every one of your users—from folks rocking a JioPhone on a rural mobile network in India to people on a high-end gaming laptop wired to a 10 Gbps fiber line in Silicon Valley—will thank you.

Search Engine Watch
Keep updated with major stories about search engine marketing and search engines as published by Search Engine Watch.
ClickZ News
Breaking news, information, and analysis.
PCWorld
CNN.com - RSS Channel - App Tech Section
CNN.com delivers up-to-the-minute news and information on the latest top stories, weather, entertainment, politics and more.
