Tuesday afternoon session @ TCDL
Posted by Rachel on May 26, 2009
There were lots of little discussions that took place throughout the afternoon. Speakers came from Texas A&M, the University of North Texas, the University of Texas at Austin, and the Texas State Library & Archives Commission. Here are some of the topics of discussion, and random notes. A lot of it was a bit over my head, but I generally got the concepts and have some ideas to bring back to Houston to share with the Web Services department.
- Service institutions <– OAI-PMH –> Data providers <–> OAI-ORE
- Collapsible / Expandable list - used jQuery, Manakin on DSpace
- Communities can contain other communities and collections. Collections can only contain items. What is appropriate to share with public?
- Similarity search in DSpace? What does Lucene index? More than just Dublin Core?
- You need good quality metadata to drive facets.
- Pairtree - a specification for mapping a unique identifier to a directory path by taking its characters and splitting them into two-character chunks. It’s a microservice for repositories, written by the California Digital Library folks.
- Solr parse
- Perlbal and Django
- OpenLayers (a JavaScript library). Typically used in maps. Can work with displaying images and lets you view them in cool ways.
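To make the Pairtree note above a bit more concrete, here is a minimal Python sketch of the identifier-to-path idea. It is a simplification and not the full specification: it only applies the single-character substitutions ("/" to "=", ":" to "+", "." to ",") and skips the hex-encoding step the spec defines for other special characters. The function name is my own.

```python
def pairtree_path(identifier: str) -> str:
    """Map an identifier to a Pairtree-style directory path (simplified sketch)."""
    # Swap characters that are awkward in file paths for safe stand-ins.
    cleaned = identifier.translate(str.maketrans({"/": "=", ":": "+", ".": ","}))
    # Split the cleaned string into two-character chunks; the last may be shorter.
    chunks = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return "/".join(chunks) + "/"


print(pairtree_path("ark:/13030/xt12t3"))
```

Each chunk becomes one directory level, so objects with similar identifiers end up near each other on disk.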
______________
Overall, the rest of the conference was quite good. UT Austin and the Texas Digital Library were both very generous in providing the facility, food, drinks, and even a leather binder and thumb drive, all at no cost to participants. The sessions were mostly very good, and the keynote speakers were inspirational. I unfortunately didn’t take notes on a laptop for the remainder of the conference because the seating didn’t really leave much room for laptops. (More reasons for me to get a netbook!!!!) Anyway, Mingyu Chen, Michele Reilly and I learned a lot and came back to Houston with a lot of ideas. I think we are also interested in taking some road trips, including one to Baylor University Libraries out in Waco, TX. They have some pretty cool things going on there in their digital library.
Developers’ Forum & Unconference @ TCDL
Posted by Rachel on May 26, 2009
Here I am in Austin at the University of Texas for the Texas Conference on Digital Libraries. Today there are 2 preconferences, and tomorrow the actual conference starts. Today started off with some introductions around the room. People are from all over Texas, both public and academic libraries, as well as state libraries. There was even someone from a school district and a library school student.
It seems that many institutions are using DSpace with Manakin, and some are using CONTENTdm. Some use Rails or Django to support them.
There was much talk about this session being like a code4lib conference, and if it’s successful, this might turn into more of a regularly scheduled event, like a Texas chapter of code4lib.
Linked Data - Not just some file to download, but it’s actually part of the web
Tim Berners-Lee’s ideas on linked data: “The Semantic Web isn’t just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data.” (http://www.w3.org/DesignIssues/LinkedData.html)
- Use URIs as names for things
- Use HTTP URIs
- Provide useful information
- Include links to other URIs
LCCN - clean URIs
rel=alternate and resource URI –> how do we make these relationships?
Why make the relationships explicit at the presentation layer? It becomes crawl-able and mine-able. We can get integrated access applications just by working through the smart crawl / index. That’s a good thing.
Recall improves with explicit links through authority records. Doing this also helps the precision / recall decisions of the clunky OPACs on the open web. Overall, this does web stuff better. And we are still only just talking about utilizing HTML / HTTP.
It’s not about making it fit the web - it IS the web.
Places using clean linked data using LCCNs.
The value of having linked data…
- There is lots of stuff out there.
- Stuff goes away.
- No one is interested in 404 data, so you need resilient data
HOW TO LINK YOUR DATA
- Create clean, cool URIs for your data concepts
  Do this: /subject/lccn/n81072976 instead of a path with long query strings
  It’s easy to rewrite tomorrow if needed
- Get alternate data views (concept record)
  Use content negotiation and link rel=
  Offer up MARC, JSON, RDF, or whatever is next
- Relate your data to established authorities
  Your search results will be backed by shared authority records
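Here is a rough Python sketch of what "content negotiation and link rel=" could look like for one clean URI. It is only an illustration under my own assumptions: the media type table, the function names, and the idea of hanging a file extension off the clean URI are all hypothetical, and wildcards like */* are not handled (they simply fall through to the default).

```python
# Hypothetical table of the views a concept record could be offered in.
REPRESENTATIONS = {
    "application/json": "json",
    "application/rdf+xml": "rdf",
    "application/marc": "marc",
    "text/html": "html",
}


def parse_accept(header):
    """Return the media types from an Accept header, most preferred first."""
    prefs = []
    for position, part in enumerate(header.split(",")):
        fields = part.strip().split(";")
        media_type = fields[0].strip().lower()
        q = 1.0  # a missing q value means full preference
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if media_type:
            prefs.append((-q, position, media_type))
    # Highest q first; ties keep the order the client sent.
    return [media_type for _, _, media_type in sorted(prefs)]


def choose_view(accept_header, default="html"):
    """Pick the first representation we can serve for this Accept header."""
    for media_type in parse_accept(accept_header):
        if media_type in REPRESENTATIONS:
            return REPRESENTATIONS[media_type]
    return default


def alternate_links(uri):
    """Build link rel=alternate tags that advertise the non-HTML views."""
    return [
        '<link rel="alternate" type="%s" href="%s.%s">' % (media_type, uri, ext)
        for media_type, ext in sorted(REPRESENTATIONS.items())
        if ext != "html"
    ]


print(choose_view("application/rdf+xml;q=0.9, text/html;q=0.5"))
print("\n".join(alternate_links("/subject/lccn/n81072976")))
```

The point is that the clean URI stays the same no matter which view is asked for, which is what makes it easy to add "whatever is next" later.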
HOW TO MAKE IT LAST
If your site breaks when links break, cache and link yourself. You can cross reference your authority records, but use the cache concept (in case the URL goes away, you still have the concept and name of source). This is just like what we currently do in our ILS/OPACs (but in a web way). It is also a way to proxy the concept - people know they are the same thing. But the concept is in YOUR app.
dvcs lesson: No central server is better. Make every cache its own linked data source. If one goes down, the others live on. If others live on, it’s all still linked. If it’s all still linked, it lives on.
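The "cache and link yourself" idea above might be sketched like this in Python. Everything here is hypothetical (the class name, the fields, the example.org URIs, and the placeholder label); the only point is that the concept and the name of its source live in YOUR app, so the record still makes sense if the remote URL goes away.

```python
from dataclasses import dataclass


@dataclass
class CachedConcept:
    """A local copy of an authority concept, linked to its original source."""
    local_uri: str             # the clean URI in your own app
    label: str                 # the concept name, cached locally
    source_name: str           # who the concept came from
    source_uri: str            # where it lived when it was cached
    source_alive: bool = True  # flip to False once the remote link returns a 404

    def display(self):
        """Show the concept, linking out only while the source still resolves."""
        if self.source_alive:
            return f"{self.label} ({self.source_name}: {self.source_uri})"
        return f"{self.label} (cached from {self.source_name})"


concept = CachedConcept(
    local_uri="/subject/lccn/n81072976",
    label="Example heading",  # placeholder, not the real heading for this LCCN
    source_name="Example Authority File",
    source_uri="http://authority.example.org/n81072976",
)
print(concept.display())
concept.source_alive = False
print(concept.display())
```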
Innovation Ideas from Helene
Posted by Rachel on May 22, 2009
I attended Helene Blowers’ webinar called “Innovation Starts with I“. Helene blogs at Library Bytes and always has innovative things to say. I really found her presentation inspiring, and I’m glad that several librarians and even one of our Associate Deans attended the presentation. After Helene’s presentation, we had a lively hour-long discussion about our institution’s support, or lack thereof, for innovation. And I would guess that’s probably one of Helene’s goals in giving these presentations. It’s a great way to start a conversation, evaluate, share, learn, and think differently. And this is why I find it really valuable to host these types of webinars.
Anyway, here are my notes from the presentation:
Innovation IS NOT:
strategic planning, research and development, ideas, benchmarking, technology, change agentry, suggestion box
Innovation isn’t about best practices. It’s about freshness.
Innovation IS:
DOING. NEW. THINGS.
The Seeds of Innovation, by Elaine Dundon
3 Types of Innovation
- Efficiency
- Evolutionary
- Revolutionary
Evolutionary Innovation = creating something distinctly new and better
Revolutionary Innovation = radically changes business and culture
4 Components of Innovation
- Creativity
- Strategy
- Implementation
- Profitability
<Creativity>
Tips on generating new ideas
- Don’t focus on quality, but on quantity first
- Become a collector and collect everything
- Get outside your comfort zone
- Bounce your thoughts around
- Add constraints to your thinking (they can be liberating!)
- Write stuff down (keep an idea journal)
<Strategy>
It’s about how you become a change agent and get buy in. You have to be able to sell to your colleagues and the library administration.
- Make it believable. Address the mission, the vision, and make sure it adds value
- Create alliances and new relationships
- Don’t ask for permission, ask for forgiveness (or at least ask for support)
- Sell your vision personally (not in email or on paper)
- Find a champion
- Analyze the risk (not just for you, but also for the organization)
- Do your leg work (do the background work ahead of time)
- Build upon small successes
<Implementation>
This is all about doing it. Project management skills are crucial.
<Profitability>
This can be measured by outcomes and outputs.
___
Supporting Innovation (what admin can do to support staff)
- Set the strategies
- Be comfortable with risk. NO RISK = risky business
- Set the bar. You want pole vaulters, not limbo dancers.
- Encourage cliff jumping (risk takers)
- Provide a safety net
- Make failure an expectation
- Provide time for exploration
- Flip your organization structure. Make the ground fertile so your staff can grow.
- Beware of glory grabbers - people who aren’t about teamwork but are for themselves.
- Listen for what you are not hearing.
- Lead by example. Exemplify that risk taking philosophy
There are lots of innovation books out there. They are all really about leadership.
Innovation distinguishes between a leader and a follower. - Steve Jobs.
Pay attention to the differences between management and leadership.
Management is transactional. Leadership is transformational.
Some Final Thoughts…
Innovation is iterative. You need to keep the innovation continuously flowing. It can get messy because of all the risk and other things involved, but it’s worth it. Innovation is about creating new patterns. It’s curiosity at its best. It’s also in a perpetual beta state. It’s always adapting and changing to your environment. Take risk, and tip the scales. You also need to learn how to let go. Trust your staff.
Innovation starts with YOU.
Getting analytical with Google Analytics
Posted by Rachel on Apr 24, 2009
Today I attended ARL’s Google Analytics Workshop. Jonathan from LunaMetrics was running the workshop, and he was an excellent presenter. Thorough, thoughtful, quick-paced, and interesting. His company’s blog can be found at http://www.lunametrics.com/blog
Web analytics measure your site: every page, almost every click, when people come/go, how they get there, how often, what they download, repeat visitors, and whether they achieved their goals.
Did people do a search, and then get to what they were searching for?
So many people put effort into data gathering, but really don’t follow through with the analysis and then the taking of action to make changes. It is the role of the Web Analyst to understand what is and isn’t important. Put it into context for other people in the organization.
What kind of site are we? Think about your library.
- Library site (informational) –> Success is that they showed up!
- Library site (portal to services) –> Success is that they helped themselves!
- Digital Library site –> Success is that they browsed and also downloaded.
- OPAC –> Success is where they searched and viewed records
Compete.com. It might be helpful to compare our site to other library sites. There is a tool within Google that apparently lets you do this too.
Stats tell us many things, but they don’t tell us if people were happy. 4Q is an easy survey tool - it asks if people were successful, what they came to the site for, etc.
Do we measure our RSS Feeds? Take a look at Feedburner.
Google has Site Overlay, but it sucks. Try CrazyEgg or ClickTale. (not free, but has free trials)
Now on to Google Analytics….
Don’t just cut and paste the JavaScript. Think about multiple domains, sub-domains, redirects, frames, URLs with lots of query parameter junk, AJAX, Flash, non-HTML files like PDF, MP3, etc.
No analytics software is 100% accurate (for many reasons, some of which are unavoidable). Don’t freak out about decimal points. Look more at trends over time and comparisons between groups.
Most of the analytic tools are log based and server side. More are client side, but with both it is hard to get accurate data. You have to deal with cookies, robots and crawlers dealing with JavaScript, IP addresses (both static and dynamic), proxies, and tracking file types.
Want to maintain continuity with historic data? Run tools side-by-side for a while, and benchmark one against the other.
Urchin Software from Google analyzes traffic for one or more websites and provides easy-to-understand reports on your visitors - where they come from, how they use your site, what converts them into customers, and much more.
On a side note, Google has recently announced an API for Google Analytics
Currently, you can’t get Feedburner and Custom Search Engine data out of Google Analytics. You also aren’t allowed to use Google Analytics to track or collect personally identifiable information of Internet users. You also must post a privacy policy that states that you are collecting this data.
When comparing dates to the past, line up days, not months.
Bounce Rate: 30% or lower is good. 50% or higher is bad. However, something like hours, blog posts, would have higher bounce, and that’s ok.
Sooooo… If your site is a portal to lots of third parties, you want to know what they are clicking on since the bounce rate will be high and they are still getting what they want (that is, they are clicking on a database that is hosted externally).
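For anyone who wants to see the arithmetic, here is a small Python sketch of bounce rate using the rule-of-thumb thresholds from the workshop (30% or lower is good, 50% or higher is bad). The function names and the sample numbers are made up, and as noted above a "bad" number can be perfectly fine for a portal page or an hours page.

```python
def bounce_rate(pages_per_visit):
    """Share of visits that viewed exactly one page (0.0 when there are no visits)."""
    if not pages_per_visit:
        return 0.0
    bounces = sum(1 for pages in pages_per_visit if pages == 1)
    return bounces / len(pages_per_visit)


def judge(rate):
    """Apply the workshop's rule of thumb; context still matters."""
    if rate <= 0.30:
        return "good"
    if rate >= 0.50:
        return "bad"
    return "in between"


visits = [1, 3, 1, 2]  # made-up page counts for four visits
rate = bounce_rate(visits)
print(rate, judge(rate))
```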
There are also lots of GreaseMonkey scripts that can be added on to Google Analytics. Here is an article on it, but we can also email Jonathan about it because he has done a presentation on it previously.
He showed some goals for his company in Google Analytics. Examples:
- Contact us form submitted
- Purchase GA Training
- Blog Subscription signup
- Clicked on a mailto: link
Per visit Goal Value…. He suggests putting a dollar value on the visit, even if we don’t deal with money. Basically, you set up goals and then decide the “dollar value” they are worth.
While in Google Analytics, the search at the bottom when looking at content takes regular expressions as well as wildcards.
Under “All Traffic Sources”, the medium is sort of the bird’s eye view of your site.
You can also have things sent to people regularly… they don’t have to even come to the site. Reports can be sent monthly, weekly, etc. It’s for read-only users.
We could send it to a feed, and then use Yahoo Pipes to post to the Intranet.
There are three main traffic sources (how people came to you):
- Direct Traffic
- Referring sites
- Search Engines
Campaign tracking - it’s a way to follow specific marketing events that are normally hard to track. You are trying to get people to come to your page, and this tracks the campaigns.
Content
- Top Content –> examine content pages by URL
- Content by Title –> more human-readable of what the page is, unless you have multiple pages with the same title (you can also use a greasemonkey script to display whole title within GA, even though it gets cut off here, but the whole title is still included in the export)
- Content Drilldown –> arranges your URLs by subdirectory
Look at High bounce rates on the Top Landing Pages. It will help you determine how to fix some of your problems.
Goals
Goal conversions are the primary metric for measuring how well your site fulfills business objectives. A goal is a website page which a visitor reaches once they have made a purchase or completed another desired action, such as a registration or download.
Reverse Goal Path tells you how people got to their goal. Say you have a Contact Us link on every page of your site. This tool will tell you what page people came from to get there.
Goal Funnel will help you examine multi-step goals. For example…. It can show you Main event page –> Registration Page –> Purchase Page. It also shows you a percentage of people who went from step one to step two, and to step three.
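The step-to-step percentages in a funnel are simple to work out by hand. Here is a Python sketch with invented visit counts; the function name and the numbers are mine, and this is not how Google Analytics computes its report internally, just the same idea.

```python
def funnel_report(steps, counts):
    """Return (from_step, to_step, percent) for each hop through the funnel."""
    rows = []
    for i in range(1, len(steps)):
        previous = counts[i - 1]
        # Guard against dividing by zero when nobody reached the earlier step.
        percent = 100.0 * counts[i] / previous if previous else 0.0
        rows.append((steps[i - 1], steps[i], round(percent, 1)))
    return rows


steps = ["Main event page", "Registration page", "Purchase page"]
counts = [200, 50, 20]  # invented numbers of visits reaching each step
for start, end, percent in funnel_report(steps, counts):
    print(f"{start} -> {end}: {percent}%")
```

A big drop between two steps is the kind of bottleneck or leak worth looking at first.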
Advanced Segments - basically a query tool, like custom reports.
Site Search
Site Search (under Content) can be used with any type of search engine that you are using on your site.
You can create a custom site search that, for example, will tell you how many people are using various tools, leaving the site, finding the right tool, etc.
You can also set it up where you look at a page and see what terms people are typing in and getting to this page. You can reversely look up a term/phrase that people are typing in and see what pages they are landing on. This can help you determine if people are successful or not.
“Visualize” is a feature under Site Search (at the top near export) and it makes a visualization of the data. You can color bubbles, resize them, label them, and alter what you have on the X & Y axis. This is more of a sexy tool and not all that useful. However, it can help you look at changes over time (as you can play a video).
Event Tracking
You can track non-pageview events, like interaction with flash, an AJAX widget, video players, downloads of files, pdfs, clicks on outbound links, and pretty much anything where you can use JavaScript
How is this different from using _trackPageview? It doesn’t record a pageview. You can’t use an event for a goal, and you can use categories and values to collect extra data.
On to the libraries…
What do we need to pay attention to? It’s all about Key Performance Indicators: both big strategic ones and tactical ones. They must be actionable, and different KPIs will be more appropriate for different people in the library.
Visitors: staff, inside the physical library, inside the organization, from home (filter by IP address). But then you also have to worry about VPN users - where do they fit?
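As a sketch of what "filter by IP address" means in practice, here is some Python that sorts a visitor into one of the groups above. The address ranges are documentation-only example ranges that I made up for illustration, not real library networks, and VPN users would land wherever the VPN's address happens to fall, which is exactly the problem raised above.

```python
import ipaddress

# Hypothetical address ranges; a real library would use its own networks.
SEGMENTS = [
    ("staff", ipaddress.ip_network("192.0.2.0/25")),
    ("inside the library", ipaddress.ip_network("192.0.2.128/25")),
    ("inside the organization", ipaddress.ip_network("198.51.100.0/24")),
]


def classify_visitor(ip):
    """Return the first segment whose range contains the address."""
    address = ipaddress.ip_address(ip)
    for label, network in SEGMENTS:
        if address in network:
            return label
    # Anything outside the known ranges is treated as an off-site visitor.
    return "from home"


for ip in ["192.0.2.10", "192.0.2.200", "198.51.100.7", "203.0.113.5"]:
    print(ip, "->", classify_visitor(ip))
```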
Bounce rate - always look at it within the context. It’s not always a bad thing to have a high bounce rate. They come, see one page, and then leave.
Exit rate - people come, view multiple pages, and then leave from anywhere.
Library site as Informational Site
Site Search - are people finding what they are looking for? If they aren’t finding what they are looking for, then why? Does the info not exist? Are the terms users are searching on different than what is on the site? Is what they are looking for buried? Are users doing searches on things that are buried and that should be on the top level page?
Library site as a Portal
Look at all the links on a page that go to a different domain - exclude those from your bounce rate.
You can use “onclick” to help with tracking
<a href="http://www.example.com/link" onclick="pageTracker._trackPageview('/libraryservices/link')">Link</a>
If you have two different sites, should you track them separately or together? If you start at one place and go to the other, it’s probably best to track them together.
Digital Libraries
How do people get there? How do they originally find out about it? It’s really important to know how people get there. So you can look at where they came from, and you can look at where they landed.
You could create a filter of traffic coming from social media sites if you want… It’s also important for digital libraries to look at referring sites. Look at keywords people are using to find the page.
Also, did the users stick around? Look at the bounce rate and how long those users stayed at the site.
Learn how to use branded and non-branded keywords in your filtering.
Search and Referral Visitors
- Fine tune your SEO to draw traffic on relevant keywords
- Find out which sites drive valuable, relevant referral traffic and encourage them to link to you more
- Improve your lagging landing pages, or adjust keywords to target the right landing pages
Make sure you know what your goals are, and follow and track them. Are they downloading? Uploading? Registering? Signing in?
Really be sure to look at your Goal Funnels. Look for bottlenecks and leaks. Try reducing the number of steps, changing the steps, or even redefining the funnel.
Redux
- on-site search
- exit rate for internal navigation pages
- design
OPACs
- If it’s in a query parameter, you can track it.
- Categories are your friend! Are users searching by title, author, keyword, etc.?
- You could also use this on things like faceted navigation or other features that use query parameters
Possible simple goals: viewed a record
Other goals: used faceted navigation, added to bookbag, requested reserve, wrote a review, added a tag, etc.
Test your pages!
Experiment with your website. You can test Version A vs. Version B, and see which one visitors respond to better.
Google Website Optimizer - an evidence-based way of looking at and analyzing the content of your site.
Integration, tracking, and campaigns
Integrate email and more with your Google Analytics account!
http://www.example.com = http://www.example.com/?name=rachel
Campaign tracking –> put “campaign code” after the landing page URL
Medium - how people got to the site (referral, direct, organic search, paid search, email, RSS, banner, etc.; add your own)
Use a Campaign for your Marketing Activity
Campaign parameters: medium, source, campaign, term (cost per click advertising), content
Google created a nice little tool to do this: Google Analytics URL Builder. Very cool! You can see where people are coming from in source and medium!
Gotchas in Campaigns
Don’t skip source or medium. Remember to use & instead of ? in URLs that already have question marks. And be clear on how you are going to name these things. Be consistent and remember that case matters. Don’t use spaces.
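Here is a small Python sketch of what a campaign URL builder does, with the gotchas above baked in. The utm_ names are the parameters Google Analytics uses for campaign tracking; the function name is mine, and lowercasing values and swapping spaces for underscores is just one naming convention I picked to stay consistent.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_campaign(url, source, medium, campaign, term=None, content=None):
    """Append campaign tracking parameters to a landing page URL."""
    if not source or not medium:
        raise ValueError("Don't skip source or medium")
    parts = urlsplit(url)
    # Keep any query parameters the URL already has.
    query = parse_qsl(parts.query, keep_blank_values=True)
    tags = [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
        ("utm_term", term),
        ("utm_content", content),
    ]
    for name, value in tags:
        if value is not None:
            # One consistent style: lowercase, no spaces.
            query.append((name, value.strip().lower().replace(" ", "_")))
    # Rebuilding the query string takes care of ? versus & automatically.
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_campaign("http://www.example.com/?name=rachel",
                   "newsletter", "email", "Summer Reading"))
```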
Likely Goals
A goal is just a URL (like a thank you page for filling out a submission) or some other destination. A funnel is some steps (URLs) that lead up to the goal. You need to know all the steps. You can use a specific page or use regular expressions to create a range of pages.
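To show what "regular expressions to create a range of pages" might mean, here is a Python sketch that treats any form's thank-you page as the goal. The URL layout and the names are hypothetical, and the regular expression flavor inside Google Analytics may differ in small ways from Python's.

```python
import re

# Hypothetical layout: every form has its own thank-you page under /forms/.
GOAL_PATTERN = re.compile(r"^/forms/[^/]+/thank-you\.html$")


def is_goal(path):
    """True when the requested path counts as reaching the goal."""
    return bool(GOAL_PATTERN.match(path))


for path in ["/forms/contact/thank-you.html",
             "/forms/ill-request/thank-you.html",
             "/forms/contact/index.html"]:
    print(path, is_goal(path))
```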
Goals are specific to each profile, and each profile can only have up to 4 goals.
Goal URLs –> What if my goal URL isn’t unique? What if my goal is a pdf, not a page? What if multiple pages can be the goal? All these can be handled in GA.
Funnels –> Don’t try to make everything a funnel! Home pages should never be a start page in the funnel. Use funnels for linear, well-defined processes, like a checkout or a series of forms.
Assign a goal value with $$$, even if you aren’t using money or have monetary value. And don’t mark steps as required in the funnel. You never know where people are going to be coming in from.
Recommended Books
- Web Analytics: An Hour a Day
- Google Analytics ShortCut (coming soon in a new edition)
Webinar: Mobile Apps for Libraries and the WorldCat Mobile Pilot
Posted by Rachel on Apr 9, 2009
I attended this webinar today: Mobile Applications for Libraries and the WorldCat Mobile Pilot. The presenter was Cindy Cunningham, who has been Director of Partner Programs for OCLC since January 2007 and works with non-library companies and entities to develop partnerships that benefit the OCLC cooperative. Before that she worked for Corbis.com, Amazon.com, the University of Washington Libraries, Kitsap Regional Library, and the Library of Congress.
Her presentation was quite excellent! Here are my notes and comments:
As of October 2007:
- 1 billion PCs
- 1.2 billion Internet users
- 3.3 Billion mobile phones
Early Adopters
- Ball State University Libraries
- American University
- Boston University
- Harvard College
- University of Virginia (They use transcoding to work with devices for disabled users)
- many more…
What users want:
- receive a text citation or basic library info from the catalog on your phone
- locate the closest library
- put items on hold
- renew items
- reminders about when books are due
More of what they want:
- Credential-less authentication - allows users to click through to databases and ebooks
- citation capturing and wish list creation
- texting a librarian
- being texted with new acquisitions - with a user set-up profile
- integrated circulation information
- ability to check books out without having to go to the desk
The iPhone effect
- Yale University Cushing/Whitney Medical Library site
- Michigan’s Ann Arbor District Library
- University of Virginia
- Orange County Library District (Florida)
- WorldCat’s iPhone app, powered by Boopsie
Looking at WorldCat Mobile - It’s menu-driven, very fast, works differently than the normal WorldCat. I have it on both my smart phone and on my iPod Touch, and it’s very cool. Sure, it’s still in beta, but the potential is there, and I’m impressed.
Search results come back alphabetical, most current first. You can also just type in the first few letters of each word in the author or title.
It was launched in January of 2009 at the ALA Midwinter Conference. They have had an increase in users.
What to be aware of:
- Going mobile isn’t just about shrinking the screen real estate
- Searching online follows the mobile culture of text-speak: short fragments, fewer keystrokes
- Utilize the best of mobile - GPS positioning to understand a user’s preferences, present maps and closest results in appropriate language preferences
- Offer quick sort options - especially with a large data set
- Scrolling in a mobile environment is not desirable - better to give results in short lists
- It’s hard to provide the resources to support all Nokia, Samsung, Sony, and Smartphones
- Don’t clutter the site with user instructions for signing on and usage
WorldCat is working on creating an iPhone and Android app!
I attended the initial presentation at the 2009 ALA Midwinter session, and I downloaded it then and was quite impressed. They will be presenting at ALA Annual in Chicago as well. I encourage you to attend one of the two sessions if you are going to be at ALA Annual.
WiLS Open Source Webinar #2
Posted by Rachel on Nov 11, 2008
This is the 2nd webinar in Open Source Webinars from WiLS. Evette Atkin, a Systems Librarian at the Michigan Library Consortium, talked about what an open source ILS is and about the Michigan Evergreen Project. The Michigan Evergreen Project blog is at http://www.mlcnet.org/evergreen/
What is Evergreen?
According to Wikipedia, “Evergreen is an open source, consortial-quality Integrated Library System (ILS), initially developed by the Georgia Public Library Service for PINES (Public Information Network for Electronic Services), a statewide direct-lending consortium with over 270 member libraries.”
What does open source mean for libraries?
Traditional ILS
- Rent vs. Own
- Out of the box
- Enhancement request
Open Source ILS
- Rent vs. Own
- Share code
- Customize
Why use Evergreen?
- Traditional ILS - You have different levels of pricing and limited flexibility
- Koha - Web based, requires support (lots of third party support)
- Evergreen was created for a large consortium in Georgia - PINES
There are LOTS of other libraries across the country that are going with Evergreen.
Why are they going with Evergreen?
- To provide libraries with an affordable, hosted, open source ILS
- To provide a high functioning system for libraries without a huge tech department
When implementing an open-source ILS, the software itself may be free, but consider the cost of the hardware, the training, the staff, and possible cost-sharing opportunities.
She then proceeded to show Evergreen and all its features. Overall, it was an excellent presentation that gave a nice overview of some open source ILS options and it was good to see Evergreen in action and make comparisons and contrasts to our current ILS.
WiLS Open Source Webinar #1
Posted by Rachel on Nov 4, 2008
Mark Beatty introduced the series, and Casey Bisson kicked off the first webinar with a general discussion on open source software. He mentioned a bit about Scriblio, which is a tool he developed that is based on WordPress and implemented at Plymouth State University in New Hampshire.
Open Source Software- more efficient way to do business.
Free software improves the efficiency of business.
Companies often spend more money and time developing and selling their software rather than just developing a good product. Open source software just saves money.
Post-Purchase Costs
- Training
- Configuration
- Integration with other services
- Ongoing maintenance
Creative Commons only deals with products, not source code. Share alike is the notion of free software. Casey also talked about Digital Rights Management issues in libraries.
Here’s the PowerPoint version of his slides.
Here’s the PDF version of his slides.
Social Software Showcase a success!
Posted by Rachel on Jul 1, 2008
Well, I survived the BIGWIG Social Software Showcase that took place at ALA in Anaheim this year. I attended last year, and this year I was one of the presenters! I really admire my colleagues for all their cool ideas and enthusiasm. They pulled it off last year, and this year’s attendance and feedback indicated how successful this format of presentation can be.
Unfamiliar with BIGWIG or what the Social Software Showcase is all about? BIGWIG is a complicated acronym which is basically the Blogs and Wikis Interest Group of LITA, the Library and Information Technology Association, a division of ALA, which is the mothership of library organizations. BIGWIG is all about investigating, playing with, and sharing the latest collaborative and communicative technologies and looking at how they can be used effectively in libraries and beyond. I am proud to be a part of this group and excited to be doing my part. Yet I want to do more…
Last year, the Social Software Showcase had about 25 physical attendees. This year, we had 125+ people and even a live video stream of the Showcase via uStream. There was no more room on the floor, at a table, or anywhere in the room. But what makes this thing different than any other program at ALA? Well, typically, the format for most programs at these conferences is someone or a panel speaks, and the audience listens. There might be time for Q&A at the end, but engaging the audience is not always the priority.
BIGWIG feels that people come to a conference to learn and to do networking and interact with other people - things that you can’t as easily do at home. Why fly all the way out to California and sit and listen to someone talk when you can listen to a podcast of his or her talk or watch a video of the presentation at home? The opportunity to interact with those who are presenting - the experts, the innovators, the people who are enthusiastic about technology - is what the Showcase is really about. So if you go to YourBIGWIG.com, you’ll see that all the presentations, both from those who were present and those who weren’t at ALA, are online. The showcase’s live stream was recorded and is available to watch! At the Showcase, we basically just did introductions, spoke for a couple minutes about what our “topic” was, and led a table discussion. Attendees were free to walk around from table to table to ask questions or just listen in. There were people who knew a lot and some who were brand new and just wanted to see what we were all about. It was an awesome time. If I couldn’t answer a question, other people at the table usually could.
The format of the Showcase provided an opportunity for people who don’t know much about a particular technology to ask questions in a small group and to feel comfortable doing so. Those people might not speak up in a room of 500 people.
The format also allowed people with similar interests and projects to network. Again, it would be harder to do so in a group of 500.
I hope that as more people come and experience the closeness, the enthusiasm, the innovativeness, and the technologies presented at Showcases in the future, they will take some ideas and possibly even the format back to their interest groups, committees, and divisions. This Showcase proves that presentations can be more engaging, fun, virtual, and interactive. Isn’t that what Library 2.0 is supposed to be about anyway? Why aren’t more people giving presentations about Library 2.0 in a Library 2.0 way? We do this for our users - why not for one another?
My Presentation for the upcoming ALA Conference
Posted by Rachel on Jun 23, 2008
Below you’ll find the presentation (minus my commentary) for the upcoming BIGWIG Social Software Showcase at the ALA Annual Conference in Anaheim, CA. If you want to hear me talk about this topic, you can listen to the presentation on blip.tv. Below is just the visual part of my presentation embedded from Slideshare.
I’d like to thank Lee Hilyer for inspiration on improving my presentation techniques. He blogs and makes the world safe for presentation audiences everywhere at his site Presentations for Librarians.
CIL 2008: Postconference: Web Services for Librarians
Posted by Rachel on Apr 10, 2008
Presenter: Jason Clark, Head of Digital Access & Web Services
The presentation and the handout can be found on his website, http://www.lib.montana.edu/~jason/talks.php
What is an API?
An API is an application programming interface for developers to access parts of a remote web site and integrate it with their own site.
An API allows you to manipulate the data. You can run a query, and let people use your data in other ways.
What is a web service?
- Broader term
- Public interface (API)
- Provides access to data and/or procedures
- On a remote/external system (usually)
- Use structured data for data exchange (often XML)
Find a service you like, and learn how to modify it. Play with it, use it, know it inside and out.
What is structured data?
Structured data = XML and JSON
- Extensible Mark-up Language and JavaScript Object Notation
- Flexible mark-up languages
- Lightweight and easy to parse
- Allow communication between disparate systems
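To make the XML/JSON comparison concrete, here is a minimal sketch that reads the same record from both formats using only Python's standard library. The book record and the function names are made up for illustration; they are not from Jason's handout.

```python
import json
import xml.etree.ElementTree as ET

# The same made-up book record, once as XML and once as JSON.
XML_RECORD = """<book>
  <title>Example Title</title>
  <author>Example Author</author>
  <year>2008</year>
</book>"""

JSON_RECORD = '{"title": "Example Title", "author": "Example Author", "year": 2008}'


def book_from_xml(text):
    """Parse the XML record into a plain dict."""
    root = ET.fromstring(text)
    return {
        "title": root.findtext("title"),
        "author": root.findtext("author"),
        "year": int(root.findtext("year")),  # XML carries text only, so convert
    }


def book_from_json(text):
    """Parse the JSON record into a plain dict."""
    return json.loads(text)


if __name__ == "__main__":
    print(book_from_xml(XML_RECORD))
    print(book_from_json(JSON_RECORD))
```

Both functions end up with the same dictionary, which is the point: two systems that share nothing else can still trade data if they agree on the structure.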
POST and GET
- You can post and get data from a particular web service
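A rough sketch of the difference between the two, assuming a hypothetical endpoint at api.example.org (not a real service). With GET the parameters ride along in the URL; with POST they go in the request body. The requests are only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, used only for illustration.
BASE_URL = "http://api.example.org/books"


def build_get(params):
    """GET: parameters travel in the URL's query string."""
    return Request(BASE_URL + "?" + urlencode(params))


def build_post(params):
    """POST: parameters travel in the request body."""
    body = urlencode(params).encode("utf-8")
    return Request(BASE_URL, data=body)


if __name__ == "__main__":
    get_req = build_get({"q": "library mashups"})
    post_req = build_post({"title": "Example Title"})
    print(get_req.get_method(), get_req.full_url)
    print(post_req.get_method(), post_req.full_url, post_req.data)
```

To actually send either request against a real service, you would pass it to `urllib.request.urlopen`.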
Why USE web services?
- Access to content/data stores you could not otherwise provide (zip codes, news, pictures, reviews, etc.)
- Enhance site with a service that is not feasible for you to provide (maps, search, products, etc.)
- Combine these services into a seamless service you provide (mash-ups)
Why PROVIDE web services?
- You have a service that benefits your users best if they can get to their data from outside the application
- You want others to use your data store in their applications
So… What web services are available? There are tons to choose from. Here are some of the biggies:
- Google (reviews, images, etc.)
- Yahoo!
- Amazon
- eBay
- Flickr
- del.icio.us
- Google App Engine (fairly new!!! It’s a hosting service for web applications.)
- Amazon S3 (This is also a storage service.)
- Many more…
Some cool APIs
What is SOAP?
- XML files talking to one another.
- An acronym for Simple Object Access Protocol
- Version 1.2 of the W3C recommendation dropped the acronym
- Specification maintained at w3.org
- There’s nothing simple about SOAP!
- Send a message specifying an action to take, including data for the action
- Receive a return value from the action
- Most SOAP services provide a WSDL file to describe the actions provided by the service
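Here is a minimal sketch of the "send a message, receive a return value" pattern, assuming a hypothetical catalog service. The `GetTitle` action, the `http://example.org/catalog` namespace, and the sample response are all invented; only the SOAP 1.2 envelope namespace comes from the W3C specification. A real service's WSDL would tell you the actual action and field names.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
SERVICE_NS = "http://example.org/catalog"  # hypothetical service namespace

# A made-up response, shaped the way a SOAP service would wrap a return value.
SAMPLE_RESPONSE = f"""<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Body>
    <GetTitleResponse xmlns="{SERVICE_NS}">
      <title>Example Title</title>
    </GetTitleResponse>
  </soap:Body>
</soap:Envelope>"""


def build_request(action, **fields):
    """Wrap an action and its data in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{action}")
    for name, value in fields.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_response(xml_text, result_tag):
    """Pull the return value out of a SOAP response envelope."""
    root = ET.fromstring(xml_text)
    return root.findtext(f".//{{{SERVICE_NS}}}{result_tag}")


if __name__ == "__main__":
    print(build_request("GetTitle", isbn="0000000000"))
    print(read_response(SAMPLE_RESPONSE, "title"))
```

Even in this toy version you can see the messaging (envelope, body) wrapped around the data, which is a good part of why SOAP feels heavy.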
What’s WSDL?
- Web Services Description Language
- XML mark-up for describing the functionality provided by a SOAP service
SOAP is complex
- Complex
- Messaging and Data mingled
- Usually seen in software APIs, but many scripting languages have libraries
- Google API has moved away from it
What is XML-RPC?
- XML Remote Procedure Call
- Specification maintained at xmlrpc.com
- Provides a means to call methods/procedures on a remote server and make changes and/or retrieve data
- An early specification
- Most common implementation of XML-RPC used today is that of blog ping services (Technorati, Flickr, FeedBurner, others?)
- An updating protocol
- Early adoption, but little recent development
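For a feel of what an XML-RPC call looks like on the wire, here is a small sketch using Python's built-in `xmlrpc.client`. It assumes the `weblogUpdates.ping` method that blog ping services commonly accept (site name, then site URL); the blog name and URL are made up, and nothing is sent over the network.

```python
import xmlrpc.client

# Made-up blog details; weblogUpdates.ping is assumed to take (name, url).
params = ("Example Library Blog", "http://blog.example.org/")

# Encode the remote procedure call as the XML that would be POSTed.
payload = xmlrpc.client.dumps(params, methodname="weblogUpdates.ping")
print(payload)

# Decoding shows the same call as the receiving server would see it.
decoded_params, method = xmlrpc.client.loads(payload)
print(method, decoded_params)
```

To make the call for real you would use `xmlrpc.client.ServerProxy` with the ping service's endpoint URL, which does the encoding and the HTTP POST for you.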
What is REST?
- The greatest thing since sliced…
- Representational State Transfer
- Unique data resources with addresses
Theory of REST
- Focus on diversity of resources (nouns), not actions (verbs)
- Every resource is uniquely addressable
- All resources share the same constrained interface for transfer of state (actions)
- Must be stateless, cacheable, and layered
REST = Web Protocol
- Web As Prime Example
- URLs uniquely address resources
- HTTP methods (GET, POST, HEAD, etc.) and content types provide a constrained interface
- All transactions are atomic
- HTTP provides cache control
- Similarity to web - easy to understand
- URL is the method
- Most popular type of web service
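The ideas above can be sketched without any network at all. This is a toy, in-memory stand-in (the `/books/...` paths and the `handle` function are invented for illustration): every resource has its own address, and the same few methods apply to all of them. Each call carries everything needed to answer it, which is the stateless part.

```python
# In-memory stand-in for a server's resources, keyed by URL path (all made up).
resources = {"/books/1": {"title": "Example Title"}}


def handle(method, path, body=None):
    """Apply one of a small, fixed set of HTTP methods to the resource at path.

    Returns an (HTTP status code, representation) pair.
    """
    if method == "GET":
        if path in resources:
            return 200, resources[path]
        return 404, None
    if method == "PUT":
        status = 200 if path in resources else 201  # updated vs. created
        resources[path] = body
        return status, body
    if method == "DELETE":
        if resources.pop(path, None) is not None:
            return 204, None
        return 404, None
    return 405, None  # method is not part of the constrained interface


if __name__ == "__main__":
    print(handle("GET", "/books/1"))
    print(handle("PUT", "/books/2", {"title": "Another Title"}))
    print(handle("DELETE", "/books/2"))
```

Notice there is no `getBook` or `deleteBook` function: the URL names the thing (the noun), and the HTTP method says what to do with it.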
“Respect the URL!” - Jason Clark
Web Services in Action
- Scriblio - Casey Bisson built a library catalog within WordPress. It’s being used at Plymouth State University. Here is their description:
- Scriblio (formerly WPopac) is an award winning, free, open source CMS and OPAC with faceted searching and browsing features based on WordPress. Scriblio is a project of Plymouth State University, supported in part by the Andrew W. Mellon Foundation.
- Repository66: mash-up of OpenDOAR data with Google Maps and repository growth charts
- Mashup of Google Maps and repositories
- ROAR, developed by Stuart Lewis of the University of Aberystwyth, Wales
- LibraryThing APIs
- lofiAPI: MSU Libraries (ETD, RMT)
- MSU Library Lifestream: RSS services (Twitter, del.icio.us, last.fm, MSU Library Blog)
Lifestream. You could create a “subject guide” based on what a particular librarian is finding in LibraryThing, YouTube, and other types of web services. Use this to show people what folks in the library are doing!
- TERRApod YouTube admin
- Google Book Search
Start small, with something simple like RSS before you tackle Amazon or Google APIs.
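As an idea of how small "starting small" can be, here is a minimal sketch that pulls titles and links out of an RSS 2.0 feed with Python's standard library. The feed below is made up; in practice the text would come from a real feed URL.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed; a real one would be fetched from a feed URL.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Library Blog</title>
    <item>
      <title>New books this week</title>
      <link>http://blog.example.org/new-books</link>
    </item>
    <item>
      <title>Holiday hours</title>
      <link>http://blog.example.org/hours</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_text):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for title, link in read_items(SAMPLE_FEED):
        print(title, link)
```

Once you have the pairs, dropping them into a web page as a list of links is only a few more lines.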
Getting our hands dirty now! Take a look at some of the examples from class RIGHT HERE (includes code and demos!):
http://www.lib.montana.edu/~jason/files.php
Here are three examples we looked at:
- Google Ajax Search API
- Amazon Reviews & Thumbnails (PHP)
- Flickr API - Display Photos (JSON)
Page with new books. Include images, reviews, etc.
Pull in pics from Flickr.
What Jason learned from playing with all this
- Web services are closed source software
- Documentation and online support are vital
- Debugging can be hard
- Similarities to common protocols are important
- Practice and finding your development kit are essential
His last thoughts…
- This stuff is just beginning
- WorldCat API - IT’S COMING!!!
- Digital Library Federation API recommendation
- Library mashups are coming - there’s just too much good data out there
Overall, this is all REALLY cool stuff. I can’t wait to get back to campus and get down and dirty with some of this stuff! So many ideas are rumbling around in my head. APIs rock!