These are the news items I've curated in my monitoring of the API space that have some relevance to the API definition conversation, and that I wanted to include in my research. I'm using all of these links to better understand how the space is testing their APIs, going beyond just monitoring to understand the details of each request and response.
06 Aug 2018
As we prepare for APIStrat in Nashville, TN this September 24th through 26th, I asked my partner in crime Audrey Watters (@audreywatters) to write a post on the significance of Virginia Eubanks, the author of Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor keynoting the conference–she shared this story, of why her work is so significant, and why it is important for the API community to tune in.
Repeatable tasks can and should be automated – that’s an assertion that you’ll hear all the time in computing.
Sometimes the rationale is efficiency – it’s cheaper, faster, “labor-saving.” Automation will free up time; it will make our lives easier. Or so we’re told.
Sometimes automation is encouraged in order to eliminate human error or bias.
Increasingly, automation is eliminating human decision-making altogether. And in doing so, let’s be clear, neither bias nor error is removed; rather they are often re-inscribed. Automation – algorithmic decision-making – can obscure error; it can obscure bias.
This push for more automated decision-making works hand-in-hand with the push for more data collection, itself a process that is already shaped by precedent and by politics. And all this, of course, is facilitated by APIs.
APIs are commonly referred to as a “glue” of sorts – the implication, more often than not, is that APIs are simply a neutral technology holding larger technical systems together. But none of this is neutral – not the APIs and not the algorithms and not the databases.
These technologies are never neutral in their design, development, or implementation. The systems that technologies exist in – organizationally, economically, politically, culturally – are never neutral either.
It seems imperative that those building digital technologies begin to think much more critically about the implications of their work, recognizing that the existing inequalities in the analog systems are readily being ported to the digital sphere.
This makes the work of one of the keynote speakers at this fall’s API Strategy and Practice conference particularly timely: Virginia Eubanks is a political science professor at the University at Albany, SUNY and the author of Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. The book is a powerful work of ethnography, chronicling the ways in which data mining, predictive modeling, and algorithmic decision-making reproduce and even exacerbate inequalities in housing, health care, and social welfare services. “The digital poorhouse,” Eubanks calls it.
“When we talk about the technologies that mediate our interactions with public agencies today,” she writes, “we tend to focus on their innovative qualities, the ways they break with convention. Their biggest fans call them ‘disruptors,’ arguing that they shake up old relations of power, producing government that is more transparent, responsive, efficient, even inherently more democratic.” This argument overlooks the ways in which new technologies are necessarily entangled in old systems of power. Moreover, those building these technologies benefit from a privilege that both shields them from and blinds them to the ramifications of their work on those most marginalized politically and economically.
Without a purposeful effort to address systemic inequalities, technologies will only make things worse. APIs will only make things worse. Instead, we must be part of the work of rethinking these old systems, listening to those on the margins, and reorienting our technological practices towards equity and justice.
I’ve been reading about all the work Facebook and Twitter have been doing over the last couple of weeks to begin asserting more control over their API applications. I’m not talking about the deprecation of APIs, that is a separate post. I’m focusing on them reviewing applications that have access to their API, and shutting off access to the ones who aren’t adding value to the platform or are violating the terms of service. Doing the hard work to maintain a level of quality on the platform, which is something they should have been doing all along.
I don’t want to diminish the importance of the work they are doing, but it really is something that should have been done along the way–not just when something goes wrong. This kind of behavior really sets the wrong tone across the API sector, and people tend to focus on the thing that went wrong, rather than the best practices of what you should be doing to maintain quality across API operations. Other API providers will hesitate to launch public APIs because they’ll not want to experience the same repercussions as Facebook and Twitter have, completely overlooking the fact that you can have public APIs, and maintain control along the way. Setting the wrong precedent for API providers to emulate, and damaging the overall reputation of operating public APIs.
Facebook and Twitter have both had the tools all along to police the applications using their APIs. The problem is that the incentive to do so, and to prioritize these efforts, isn’t there, due to an imbalance with their business model, and a lack of diversity in their leadership. When you have a bunch of white dudes with a libertarian ethos pushing a company towards profitability with an advertising-driven business model, investing in quality control at the API management layer just isn’t a priority. You want as many applications, users, and activity as you possibly can, and when you don’t see the abuse, harassment, and other illnesses, there really is no problem from your vantage point. That is, until you get called out in the press, or are forced to testify in front of Congress. The reason us white dudes get away with this is that there are no repercussions, we just get to ignore it until it becomes a problem, apologize, perform a little bit to show we care, and wait until the next problem occurs.
This is the wrong API model to put out there. API providers need to see the benefits of properly reviewing applications that want access to their APIs, and the value of setting a higher bar for how applications use the API. There should be regular reviews of active APIs, and audits of how they are accessing, storing, and putting resources to work. This isn’t easy work, or inexpensive to do properly. It isn’t something you can put off until you get in trouble. It is something that should be done from the beginning, and conducted regularly, as part of the operations of a well funded team. You can have public APIs for a platform, and avoid privacy, security, and other shit-shows. If you need an example of doing it well, look at Slack, who has a public API that is successful, even with a high level of bot automation, but somehow manages to stay out of the spotlight for doing dumb things. It is because their API management practices are in better alignment with their business model–the incentives are there.
For the next 3-5 years I’m going to have to hear from companies who aren’t doing public APIs, because they don’t want to make the same mistake as Facebook and Twitter. All because Facebook and Twitter have been able to get away with such bad behavior for so long, avoid doing the hard work of managing their API platforms, and receive so much bad press. All in the name of growth and profits at all cost. Now, I’m going to have to write a post every six months showcasing Facebook and Twitter as pioneers for how NOT to run your platforms, explaining the importance of healthy API management practices, and investing in your API teams so they have the resources to do it properly. I’d rather have positive role models to showcase than poorly behaved ones that force me to work overtime to change perceptions and alter API providers’ behavior. As an API community let’s learn from what has happened and invest properly in our API management layers, properly screen and get to know who is building applications on our resources, and regularly tune into and audit their behavior. Yes, it takes more investment, time, and resources, but in the end we’ll all be better off for it.
I was enjoying the REST API Notes newsletter from my friend Matthew Reinbold (@libel_vox) today, and wanted to share my thoughts on his mention of my work while it was top of my mind. I always enjoy what Matt has to say, and regularly encourage him to keep writing on his blog, and keep publishing his extremely thoughtful newsletter. It is important for the API sector to have many thoughtful, intelligent voices breaking down what is going on. I recommend subscribing to REST API Notes if you haven’t, you won’t regret it.
Anyways, Matt replied with the following about my Facebook response:
Kin Lane did a fine job identifying the mechanisms most companies already have in place to mitigate the kind of bad behavior displayed by Cambridge Analytica. It is a good starting point. However, I do want to challenge one of his assertions. Kin implies that OAuth consent is a sufficient control for folks to manage their data. While better than nothing, I maintain that most consumers are incapable of making informed decisions. It’s not a question of their intelligence. It is a question of complexity and incentives for the business to be deliberately opaque.
I agree. To quote my other friend Mehdi Medjaoui (@medjawii), “OAuth is flawed, but it is the best we have”. I would say that “consumers are incapable of making informed decisions”, because we’ve crafted the world this way, and our profit margins depend on customers not being able to make informed decisions. It is how markets work things out, and “smart” people get ahead, and all that bullshit. You see this same thing playing out at the terms of service level as well. As a consumer of online services, I am regularly incapable of making informed decisions around how my data is being used, in exchange for using a free web application. Is it because I’m not smart? No, it is because terms of service are purposefully confusing, verbose, full of legalese, and mind numbing. Why? So I don’t read them, understand them, and run the other way.
This is intentional. You see this in the Tweet responses from Facebook engineers as they call all of us ignorant fools, and say that we agreed to all of this. The problem isn’t OAuth. The problem is intentional deceit and manipulation by corporations, and for some reason we keep showcasing this type of behavior as being savvy, smart, and just doing business. There is no reason that terms of service or OAuth flows can’t be helpful, informative, and in plain language. Platforms just choose to not invest in these areas, because an uninformed user means more profit for the platform, its investors, and shareholders. The lack of available startups, services, and tooling at the terms of service and privacy layer of the technology sector demonstrates what a scam all of this has been. If startups cared about their users, we would have seen investment in tools and services that help us all make sense of things. Period.
I’ve sat in multi-day OAuth negotiation sessions on behalf of the White House, negotiating OAuth scopes on behalf of power customers. I’ve seen how strong armed the corporations can be, and how little defense the average consumer has. I’ve seen technology platforms intentionally complexify things over and over. I’ve also seen plain language terms of service that actually make sense–read Slack’s for an example. I’ve seen OAuth flows that protect a user’s interest, however I haven’t seen ANY investment in actually educating end-users about what OAuth is, and why it matters. You know, like the personal finance classes we all get in high school? (I wrote that shit in 2013!!!) Why aren’t we educating the average consumer about managing their personal data and privacy? Why aren’t we setting standards for OAuth scopes, and flows for technology platforms? Good question!
I’ve spent the last eight years trying to encourage platforms to be better citizens using their APIs. I’ve been aggregating and showcasing the healthy building blocks they can use to better serve their customers, and provide more observability into how they operate. At this point, I feel like the tech sector has had its chance. Y’all chose to willfully ignore the interest of end-users, and get rich on all of our data. I know you all will cry foul now that the regulatory winds are starting to blow. Too bad. You had your chance. I’m going to focus all my energy and resources into educating policy and lawmakers about how they can use APIs, OAuth, and other building blocks already in use to put y’all into check. There is no reason the average consumer can’t make an informed decision around the terms of service of the platforms they depend on, and intelligently participate in conversations around access to their data using OAuth. As Matt said, “it is a question of complexity and incentives for the business to be deliberately opaque.”–regulations and policy are how we shift that.
I’ve been learning more about the EU General Data Protection Regulation (GDPR) recently, and have been having conversations about compliance with companies in the EU, as well as the US. In short, GDPR requires anyone working with personal data to be up front about the data they collect, making sure what they do with that data is observable to end-users, and taking a privacy and security by design approach when it comes to working with all personal data. While the regulation seems heavy handed and unrealistic to many, it really reflects a healthy view of what personal data is, and what a sustainable digital future will look like.
The biggest challenge with becoming GDPR compliant is the data mess most companies operate in. Most companies collect huge amounts of data, believing it is essential to the value they bring to the table, without any real understanding of everything that is being collected, or any logical reasons behind why it is gathered, stored, and kept around. A “gather it all”, big data mentality has dominated the last decade of doing business online. Database groups within organizations hold a lot of power and control because of the data they possess. There is a lot of money to be made when it comes to data access, aggregation, and brokering. It won’t be easy to unwind and change the data-driven culture that has emerged and flourished in the Internet age.
I regularly work with companies who do not have coherent maps of all the data they possess. If you asked them for details on what they track about any given customer, very few will be able to give you a consistent answer. Doing web APIs has forced many organizations to think more deeply about what data they possess, and how they can make it more discoverable, accessible, and usable across systems, web, mobile, and device applications. Even with this opportunity, most large organizations are still struggling with what data they have, where it is stored, and how to access it in a consistent, and meaningful way. Database culture within most organizations is just a mess, which contributes to why so many are freaking out about GDPR.
I’m guessing many companies are worried about complying with GDPR, as well as being able to even respond to any sort of regulatory policing event that may occur. This fear is going to force data stewards to begin thinking about the data they have on hand. I’ve already had conversations with some banks who are working on PSD2 compliant APIs, who are working in tandem on GDPR compliance efforts. Both are making them think deeply about what data they collect, where it is stored, and whether or not it has any value. Something I’m hoping will force some companies to stop collecting some of the data altogether, because it just won’t be worth justifying its existence in the current cyber(in)secure, and increasingly accountable regulatory environment.
Doing APIs and becoming GDPR compliant go hand in hand. To do APIs you need to map out the data landscape across your organization, something that will contribute to GDPR. To respond to GDPR events, you will need APIs that provide access to end-users’ data, and leverage API authentication protocols like OAuth to ensure partnerships, and 3rd party access to end-users’ data are accountable. I’m optimistic that GDPR will continue to push forward healthy, transparent, and observable conversations around our personal data. One that focuses on, and includes the end-users whose data we are collecting, storing, and often selling. I’m hopeful that the stakes become higher, regarding the penalty for breaches, and shady brokering of personal data, and that GDPR becomes the normal mode of doing business online in the EU, and beyond.
I’m processing the recent announcement by Facebook to shut off the access of Cambridge Analytica to its valuable social data. The story emphasizes the importance of real time awareness and response to API consumers at the API management level, as well as the difficulty in ensuring that API consumers are doing what they should be with the data and content being made available via APIs. Access to platforms using APIs is more art than science, but there are some proven ways to help mitigate serious abuses, and identify the bad actors early on, and prevent their operation within the community.
While I applaud Facebook’s response, I’m guessing they could have taken more action earlier on. Their response is more about damage control to their reputation, after the fact, than it is about preventing the problem from happening. Facebook most likely had plenty of warning signs regarding what Aleksandr Kogan, Strategic Communication Laboratories (SCL), including their political data analytics firm, Cambridge Analytica, were up to. If they didn’t, then that is a problem in itself, and Facebook should be investing in more policing of their API consumers’ activity, as they claim they are doing in their release.
If Aleksandr Kogan has that many OAuth tokens for Facebook users, then Facebook should be up in his business, better understanding what he is doing, where his money comes from, and who his partners are. I’m guessing Facebook probably had more knowledge, but because it drove traffic, generated ad revenue, and was in alignment with their business model, it wasn’t a problem. They were willing to look the other way with the data sharing that was occurring, until it became a wider problem for the election, our democracy, and in the press. Facebook should have more awareness, oversight, and enforcement at the API management layer of their platform.
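The data mapping step described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical inventory format: the `DataField` record, its field names, and the example systems are all made up for this sketch, and none of it comes from GDPR itself or from any particular compliance tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataField:
    """One entry in a hypothetical data map: what is stored, where, and why."""
    name: str
    system: str                    # the database or service holding the field
    personal: bool                 # whether the field counts as personal data
    purpose: Optional[str] = None  # documented reason for collecting it, if any


def unjustified_personal_data(inventory: List[DataField]) -> List[str]:
    """List personal-data fields that have no documented purpose."""
    return [f"{f.system}.{f.name}" for f in inventory if f.personal and not f.purpose]


inventory = [
    DataField("email", "crm", personal=True, purpose="account login"),
    DataField("birth_date", "crm", personal=True),
    DataField("page_views", "analytics", personal=False),
]
print(unjustified_personal_data(inventory))
```

Fields that show up in this kind of report are the ones a data steward would have to either justify or stop collecting.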
This situation I think highlights another problem of doing APIs, and ensuring API consumers are behaving appropriately with the data, content, and algorithms they are accessing. It can be tough to police what a developer does with data once they’ve pulled it from an API. Where they store it, and who they share it with. You just can’t trust that all developers will have the platform’s, and its end-users’, best interests in mind. Once the data has left the nest, you really don’t have much control over what happens with it. There are ways you can identify unhealthy patterns of consumption via the API management layer, but Aleksandr Kogan’s quizzes probably would appear as a normal application pattern, with no clear signs of the relationships, and data sharing going on behind the scenes.
While I sympathize with Facebook’s struggle to police what people do with their data, I also know they haven’t invested in API management as much as they should have, and they are more than willing to overlook bad behavior when it supports their bottom line. The culture of the tech space supports and incentivizes this type of bad behavior from platforms, as well as consumers like Cambridge Analytica. This is something that regulations like GDPR out of the EU are looking to correct, but the culture in the United States is all about exploitation at this level, that is until it becomes front page news, then of course you act concerned, and begin acting accordingly. The app, big data, and API economy runs on the generating, consuming, buying, and selling of people’s data, and this type of practice isn’t going to go away anytime soon.
As Facebook states, they are taking measures to rein in bad actors in their developer community by being more strict in their application review process. I agree, a healthy application review process is an important aspect of API management. However, this does not address the regular review of application usage at the API management level, assessing their consumption as they accumulate access tokens to more users’ data, and go viral. I’d like to have more visibility into how Facebook will be regularly reviewing, assessing, and auditing applications. I’d even go so far as requiring more observability into ALL applications that are using the Facebook API, providing a community directory that will encourage transparency around what people are building. I know that sounds crazy from a platform perspective, but it isn’t, and would actually force Facebook to know their customers.
If platforms truly want to address this problem they will embrace more observability around what is happening in their API communities. They would allow certified and verified researchers and auditors to get at application level consumption data available at the API management layer. I’m sorry y’all, self-regulation isn’t going to cut it here. We need independent 3rd party access at the platform API level to better understand what is happening, otherwise we’ll only see platform action after problems occur, and when major news stories are published. This is the beauty / ugliness of APIs. The cat’s out of the bag, and platforms need them to innovate, deliver resources to web, mobile, and device applications, as well as remain competitive. APIs also provide the opportunity to peek behind the curtain, and better understand what is happening, and profile the good and the bad actors within each ecosystem–let’s take advantage of the good here, to help regulate the bad.
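To make the idea of reviewing applications as they accumulate access tokens a little more concrete, here is a minimal sketch. It assumes the API management layer can export a daily count of active OAuth tokens per application; the `AppUsage` record, the `flag_for_review` function, and the growth threshold are all hypothetical, and this does not describe how Facebook or any other platform actually does it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AppUsage:
    """Hypothetical per-application export from an API management layer."""
    app_id: str
    daily_token_totals: List[int]  # active OAuth tokens, one entry per day


def growth_ratio(usage: AppUsage) -> float:
    """Last-day token total divided by first-day total (1.0 means no growth)."""
    first, last = usage.daily_token_totals[0], usage.daily_token_totals[-1]
    return last / max(first, 1)  # avoid dividing by zero for brand new apps


def flag_for_review(apps: List[AppUsage], max_ratio: float = 5.0) -> List[str]:
    """Return IDs of apps whose token count grew more than max_ratio in the window."""
    return [
        app.app_id
        for app in apps
        if len(app.daily_token_totals) >= 2 and growth_ratio(app) > max_ratio
    ]


apps = [
    AppUsage("personality-quiz", [1000, 4000, 270000]),
    AppUsage("calendar-sync", [500, 520, 560]),
]
print(flag_for_review(apps))
```

A flag like this only says that a human should take a closer look; a viral application is not automatically a bad actor, and a slow-growing one is not automatically a good one.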
I received a tweet from my friend Kelly Taylor with USDS, asking for any information regarding establishing an “approve access to production data” process for developers. He is working on an OAuth + FHIR implementation for the Centers for Medicare and Medicaid Services (CMS) Blue Button API. Establishing a standard approach for on-boarding developers into a production environment always makes sense, as you don’t want to give access to sensitive information without making sure the company, developer, and application have been thoroughly vetted.
As I do with my work, I wanted to think through some of the approaches I’ve come across in my research, and share some tips and best practices. The Blue Button API team has a section published regarding how to get your application approved, but I wanted to see if I can expand on it, while also helping share this information with other readers. This is a relevant use case that I see come up regularly in healthcare, financial, education, and other mainstream industries.
Virtualization & Sandbox
The application approval conversation usually begins with ALL new developers being required to work with a sandboxed set of APIs, only providing production API access to approved developers. This requires having a complete set of virtualized APIs, mimicking exactly what would be used in production, but in a much safer, protected environment. One of the most important aspects of this virtualized environment is that there also needs to be robust sets of virtualized data, providing as much parity as possible with what developers will experience when they enter the production environment. The sandbox environment needs to be as robust and reliable as production, which is a mistake I see made over and over from providers, where the sandbox isn’t as reliable, or as functional, and developers are never able to reach production status in a consistent and reliable way.
Doing a Background Check
Next, as reflected in the Blue Button team’s approach, you should be profiling the company and organization, as well as the individual behind each application. You see companies like Best Buy refusing any API signup that doesn’t have an official company domain that can be verified. In addition to requiring developers to provide a thorough amount of information about who they are, and who they work for, many API providers are using background and profiling services like Clearbit to obtain more details about a user based upon their email, IP address, and company domain. Enabling different types of access to API resources depending on the level of scrutiny a developer is put under. I’ve seen this level of scrutiny go all the way up to requiring the scanning of driver’s licenses, and providing corporate documents before production access is approved.
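The idea of enabling different types of access depending on the level of scrutiny can be sketched as a simple tiering rule. This is only an illustration under assumed rules: the tier names, the three checks, and the `assign_tier` function are hypothetical, and are not taken from the Blue Button, Best Buy, or Clearbit documentation.

```python
from enum import IntEnum


class AccessTier(IntEnum):
    """Hypothetical access levels, ordered from least to most trusted."""
    SANDBOX = 0
    LIMITED_PRODUCTION = 1
    FULL_PRODUCTION = 2


def assign_tier(domain_verified: bool, profile_complete: bool,
                documents_verified: bool) -> AccessTier:
    """Map the vetting a developer has passed to the access they receive."""
    # Without a verified company domain and a complete profile, stay in the sandbox.
    if not (domain_verified and profile_complete):
        return AccessTier.SANDBOX
    # Verified identity without corporate documents gets restricted production access.
    if not documents_verified:
        return AccessTier.LIMITED_PRODUCTION
    return AccessTier.FULL_PRODUCTION


print(assign_tier(domain_verified=True, profile_complete=True,
                  documents_verified=False).name)
```

Keeping the rule in one small function makes it easy to publish alongside the on-boarding documentation, so developers know exactly what is required for each level of access.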
Purpose of Application
One of the most common filtering approaches I’ve seen centers around asking developers about the purpose of their application. The more detail the better. As we’ve seen from companies like Twitter, the API provider holds a lot of power when it comes to deciding what types of applications will get built, and it is up to the developer to pitch the platform, and convince them that their application will serve the mission of the organization, as well as any stakeholders, and end-users who will be leveraging the application. This process can really be a great filter for making sure developers think through what they are building, requiring them to put forth a coherent proposal, otherwise they will not be able to get full access to resources. This part of the process should be conducted early on in the application submission process, reducing frustrations for developers if their application is denied.
Syncing The Legal Department
Also reflected in the Blue Button team’s approach is the syncing of the legal aspects of operating an API platform, and its applications. Making sure the application’s terms of service, privacy, security, cookie, branding, and other policies are in alignment with the platform. One good way of doing this is offering a white label edition of the platform’s legal documents for use by each application. Doing the heavy legal work for the application developers, while also making sure they are in sync when it comes to the legal details. Providing legal development kits (LDK) will grow in prominence in the future, just like providing software development kits (SDK), helping streamline the legalities of operating a safe and secure API platform, with a wealth of applications in its service.
Live or Virtual Presentation
Beyond the initial pitch selling an API provider on the concept of an application, I’ve seen many providers require an in-person, or virtual demo of the working application before it can be added to a production environment, and included in the application gallery. It can be tough for platform providers to test drive each application, so making the application owners do the hard work of demonstrating what an application does, and walking through all of its features is pretty common. I’ve participated on several judging panels that operate quarterly application reviews, as well as part of specific events, hackathons, and application challenges. Making demos a regular part of the application lifecycle is easier to do when you have dedicated resources in place, with a process to govern how it will all work in recurring batches, or on a set schedule.
Getting Into The Code
As part of the application review process many API providers require that you actually submit your code for review via Github. Providing details on ALL dependencies, and performing code, dependency, and security scans before an application can be approved. I’ve also seen this go as far as requiring the use of specific SDKs or frameworks, the inclusion of proxies within the client layer, and the logging of all HTTP calls made by production applications. This process can be extended to include all cloud and SaaS solutions involved, limiting where compute, storage, and other resources can be operated. Requiring all 3rd party APIs in use be approved, or already on a white list of API providers before they can be put to use. This is obviously the most costly part of the application review process, but depending on how high the bar is being set, it is one that many providers will decide to invest in, ensuring the quality of all applications that run in a production environment.
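The white list check for 3rd party APIs is one of the easier parts of this to automate. Here is a minimal sketch, assuming the developer declares the URLs their application calls as part of the submission; the host names, the `APPROVED_API_HOSTS` set, and the `unapproved_hosts` function are hypothetical examples, not any provider's actual list or tooling.

```python
from typing import Iterable, List, Set
from urllib.parse import urlparse

# Hypothetical allow list maintained by the platform's review team.
APPROVED_API_HOSTS: Set[str] = {
    "api.example-platform.test",
    "payments.example-partner.test",
}


def unapproved_hosts(declared_urls: Iterable[str],
                     approved: Set[str] = APPROVED_API_HOSTS) -> List[str]:
    """Return the sorted hosts an application calls that are not on the allow list."""
    hosts = {urlparse(url).hostname for url in declared_urls}
    # urlparse returns None for strings without a host, so skip those entries.
    return sorted(h for h in hosts if h is not None and h not in approved)


declared = [
    "https://api.example-platform.test/v1/users",
    "https://tracker.unknown-broker.test/collect",
]
print(unapproved_hosts(declared))
```

A declared list is only as honest as the developer submitting it, which is why this kind of check is usually paired with the code and dependency scans, and with logging of HTTP calls in production.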
Regular Review & Reporting
One important thing about the application review process is that it isn’t a one time process. Even once an application is accepted and added into the production environment, this process will need to be repeated for each version release of the application, along with the changes to the API. Of course the renewal process might be shorter than the initial approval workflow, but auditing and regular check-ins should be common, and not forgotten. This touches on the client level SDK, and API management logging needs of the platform, and that regular reporting upon application usage and activity should be available in real time, as well as part of each application renewal. API operations is always about taking advantage of the real time awareness introduced at the API consumption layer, and staying in tune with the healthy, and not so healthy patterns that emerge from logging everything an application is doing.
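A recurring review is easier to keep up with when the triggers are written down. The sketch below assumes two triggers taken from the paragraph above, a new version release and a maximum time between reviews; the `ApprovedApp` record, the `needs_review` function, and the 90 day default are hypothetical choices for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class ApprovedApp:
    """Hypothetical record kept for each application approved for production."""
    app_id: str
    approved_version: str
    current_version: str
    last_review: date


def needs_review(app: ApprovedApp, today: date,
                 max_age: timedelta = timedelta(days=90)) -> List[str]:
    """Return the reasons an application is due for another review, if any."""
    reasons = []
    if app.current_version != app.approved_version:
        reasons.append("new version released")
    if today - app.last_review > max_age:
        reasons.append("review interval exceeded")
    return reasons


app = ApprovedApp("claims-viewer", "1.0", "1.1", date(2018, 1, 1))
print(needs_review(app, today=date(2018, 6, 1)))
```

An empty list means nothing is due yet, and anything else can feed a queue for whoever runs the quarterly or per-release reviews.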
Business Model
It is common to ask application developers about their business model. The absence of a business model almost always reflects the underlying exploitation and sale of data being accessed or generated as part of an application’s operation. Asking developers how they will make money and sustain their operations, along with regular check-ins to make sure it is truly in effect, is an easy way to help ensure that applications are protecting the interests of the platform, its partners, and the applications’ end-users.
There are many other approaches I’ve seen API providers require before accepting an application into production. However, I think we should also be working hard to keep the process simple, and meaningful. Of course, we want a high bar for quality, but as with everything in the API world, there will always be compromises in how we deliver on the ground. Depending on the industry you are operating in, the bar will be made higher, or possibly lowered a little to allow for more innovation. I’ve included a list of some of the application review processes I found across my research–showing a wide range of approaches across API providers we are all familiar with. Hopefully that helps you think through the application review process a little more. It is something I’ll write about again in the future as I push forward my research, and distill down more of the common building blocks I’m seeing across the API landscape.
Some Leading Application Review Processes
I was integrating with the Clearbit API, doing some enrichment of the API providers I track, and I found their API stack pretty interesting. I’m just using the enrichment API, which allows me to pass it a URL, and it gives me back a bunch of intelligence on the organization behind it. I’ve added a bookmarklet to my browser, which allows me to push it, and the enriched data goes directly into my CRM system. Delivering what the title says it does–enrichment.
Next up, I’m going to be using the Clearbit Discovery API to find some potentially new companies who are doing APIs in specific industries. As I head over to the docs for the API, I notice the other three APIs, and I feel like they reflect the five stages of transition to the data intelligence dark side.
- Enrichment API - The Enrichment API lets you look up person and company data based on an email or domain. For example, you could retrieve a person’s name, location and social handles from an email. Or you could lookup a company’s location, headcount or logo based on their domain name.
- Discovery API - The Discovery API lets you search for companies via specific criteria. For example, you could search for all companies with a specific funding, that use a certain technology, or that are similar to your existing customers.
- Prospector API - The Prospector API lets you fetch contacts and emails associated with a company, employment role, seniority, and job title.
- Risk API - The Risk API takes an email and IP and calculates an associated risk score. This is especially useful for figuring out whether incoming signups to your service are spam or legitimate, or whether a payment has a high chargeback risk.
- Reveal API - Reveal API takes an IP address, and returns the company associated with that IP. This is especially useful for de-anonymizing traffic on your website, analytics, and customizing landing pages for specific company verticals.
Your journey to the dark side begins innocently enough. You just want to know more about a handful of companies, and the data provided is a real time saver! Then you begin discovering new things, finding some amazing new companies, products, services, and insights. You are addicted. You begin prospecting full time, and actively working to find your latest fix. Then you begin to get paranoid, worried you can’t trust anyone. I mean, if everyone is behaving like you, then you have to be on your guard. That visitor to your website might be your competitor, or worse! Who is it? I need to know everyone who comes to my site. Then in the darkest depths of your binges you are using the reveal API and surveilling all your users. You’ve crossed to the dark side. Your journey is complete.
Remember kids, this is all a very slippery slope. With great power comes great responsibility. One day you are a scrappy little startup, and the next you’re the fucking NSA. In all seriousness, I think their data intelligence stack is interesting. I do use the enrichment API, and will be using the discovery API. However, we do have to ask ourselves, do we want to be surveilling all our users and visitors? Do we want to be surveilled on every site we visit, and on every application we use? At some point we have to make sure and check how far towards the dark side we’ve gone, and ask ourselves, is this all really worth it?
P.S. This story reminds me I totally flaked on delivering a white paper to Clearbit on the topic of risk. Last year was difficult for me, and I got swamped….sorry guys. Maybe I’ll pick up the topic and send something your way. It is an interesting one, and I hope to have time at some point.
I remember when almost all the APIs out there gave us developers access to things we couldn’t ever possibly get on our own. Some of it was about the network effect with the early Amazon and eBay marketplaces, or Flickr and Delicious, and then Twitter and Facebook. Then what really brought it home was going beyond the network effect, and delivering resources that were completely out of our reach like maps of the world around us, (seemingly) infinitely scalable compute and storage, SMS, and credit card payments. In the early days it really seemed like APIs were all about giving us access to something that was out of our reach as startups, or individuals.
While this still does exist, it seems like many APIs have flipped the table and it is all about giving them access to our personal and business data in ways that used to be out of their reach. Machine learning APIs are using parlour tricks to get access to our internal systems and databases. Voice enablement, entertainment, and cameras are gaining access to our homes, what we watch and listen to, and are able to look into the dark corners of our personal lives. Tinder, Facebook, and other platforms know our deep dark secrets, our personal thoughts, and have access to our email and intimate conversations. The API promise seems to have changed along the way, and stopped being about giving us access, and is now about giving them access.
I know it has always been about money, but the early vision of APIs seemed more honest. It seemed more about selling a product or service that people needed, and was more straight up. Now it just seems like APIs are invasive. Being used to infiltrate our professional and business worlds through our mobile phones. It feels like people just want access to us, purely so they can mine us and make more money. You just don’t see many Flickrs, Google Maps, or Amazon EC2s anymore. The new features in mobile devices we carry around, and the ones we install in our home don’t really benefit us in new and amazing ways. They seem to offer just enough to get us to adopt them, and install in our life, so they can get access to yet another data point. Maybe it is just because everything has been done, or maybe it is because it has all been taken over by the money people, looking for the next big thing (for them).
Oh no! Kin is ranting again. No, I’m not. I’m actually feeling pretty grounded in my writing lately, I’m just finding it takes a lot more work to find interesting APIs. I have to sift through many more emails from folks telling me about their exploitative API, before I come across something interesting. I go through 30 vulnerabilities posts in my feeds, before I come across one creative story about something a platform is doing. There are 55 posts about ICOs, before I find an interesting investment in a startup doing something that matters. I’m willing to admit that I’m a grumpy API Evangelist most of the time, but I feel really happy and content, and I’m enjoying my research overall. I just feel like the space has lost its way with this big data thing, and is using APIs in a way that is more about infiltration and extraction than it is about delivering something that actually gives developers access to something meaningful. I just think we can do better. Something has to give, or this won’t continue to be sustainable much longer.
Just admit it, you couldn’t care less about your API consumers. You are just playing this whole API game because you read somewhere that this is what everyone should be doing now. You figured you can get some good press out of doing an API, get some free work from developers, and look like you are one of the cool kids for a while. You do the song and dance well, you have developed and deployed an API. It will look like the other APIs out there, but when it comes to supporting developers, or actually investing in the community, you really aren’t that interested in rolling up your sleeves and making a difference. You just don’t really care that much, as long as it looks like you are playing the API game.
Honestly, you’d do any trend that comes along, but this one has so many perks you couldn’t ignore it. Not only do you get to be API cool, you did all the right things, launched on Product Hunt, and you have a presence at all the right tech events. Developers are lining up to build applications, and are willing to work for free. Most of the apps that get built are worthless, but the SDKs you provide act as a vacuum for data. You’ve managed to double your budget by selling the data you acquire to your partners, and other data brokers. You could give away your API for free, and still make a killing, but hell, you have to keep charging just so you look legit, and don’t raise any alarm bells.
It is hard to respect developers who line up and work for free like this. And the users, they are so damn clueless regarding what is going on, they’ll hand over their address book and location in real-time without ever thinking twice. This is just too easy. APIs are such a great racket. You really don’t have to do anything but blog every once in a while, show up at events and drink beer, and make sure the API doesn’t break. What a sweet gig huh? No, not really, you are just a pretty sad excuse of a person, and it will catch up with you somewhere. You really represent everything wrong with technology right now, and are contributing to the world being a worse place than it already is–nice job!
Note: If my writing is a little dark this week, here is a little explainer–don’t worry, things will be back to normal at API Evangelist soon.
I was reading about how Carbon Black, an endpoint detection and response (EDR) service, was exposing customer data via a 3rd party API service they were using. The endpoint detection and response provider allows customers to optionally scan system and program files using the VirusTotal service. Carbon Black did not realize that premium subscribers of the VirusTotal service get access to the submitted files, allowing any company or government agency with premium access to VirusTotal’s application programming interface (API) to mine those files for sensitive data.
It provides a pretty scary glimpse at the future of privacy and security in a world of 3rd party APIs if we don’t think deeply about the solutions we bake into our applications and services. Each API we bake into our applications should always be scrutinized for privacy and security concerns, making sure end-users aren’t being subjected to unnecessary situations. This situation sounds like it was both API provider and consumer contributing to the privacy violation, and adjusting platform access levels, and communicating with API consumers would be the best path forward.
Beyond just this situation, I wanted to write about this topic as a cautionary tale for the unfolding machine learning API landscape. Make sure we are thinking deeply about what data and content we are making available to platforms via artificial intelligence and machine learning APIs. Make sure we are asking the hard questions about the security and privacy of data and content we are running through machine learning APIs. Make sure we are thinking deeply about what data and content sets we are running through the machine learning APIs, and reducing any unnecessary exposure of personal data, content, and media.
It is easy to be captivated by the magic of artificial intelligence and machine learning APIs. It is easy to view APIs as something external, and not much of a privacy or security threat. However, with each API call we are inviting a 3rd party API into our databases, files, and other private systems. Let’s make sure we have an honest conversation with our API providers about how data and content is accessed, stored, cached, and used as part of any AI or ML process. Let’s make sure we get clarification on which partners, or other 3rd party providers are getting access to data and content that is indexed and executed as part of AI and ML API requests and responses. How long are videos or images stored? How long is data stored?
I’m seeing more discussion around dependencies going on in the API space. Which software libraries and APIs are we depending on for our applications and services? I’m feeling like this conversation is going to continue expanding, and security, privacy, and observability are going to become a more significant part of these dependency discussions. It will be a conversation that continues to push API deployment on-premise, and pushes providers to be observable about how ML and AI API operations are being logged, stored, and tracked. I’m going to keep watching how APIs are intentionally or unintentionally violating security and privacy like this, and keep an eye on the API dependency conversation to see how it evolves as part of this security and privacy discussion.
I believe that APIs can bring some important transparency to the web, mobile, and device applications that seem to be invading our life. I hesitate using the word transparency because it has been weaponized by Wikileaks and others in the current cyber(in)secure landscape, but for the purposes of this story, it will work. APIs by default do not mean transparency, but when done in the right way they can pull back the curtain a little on what is going on, when the company, organization, institution, or agency behind them is truly committed to transparency.
I’ve long had a portion of my research dedicated to studying intentional transparency efforts by API providers, giving me a place to publish any organizations, links, and stories that I publish on the subject of API transparency. As part of my API research I was looking at some university API efforts the other day when I came across the Apply Magic Sauce API, a personalisation engine that accurately predicts psychological traits from digital footprints of human behavior, which had a pretty interesting section dedicated to the subject of transparency. Here is some background on their approach:
Our methods have been peer-reviewed and published in open access journals since 2013, and new services that sound similar to Apply Magic Sauce API (AMS) are springing up every day. As this technology becomes more accessible and its impact increases, we would like to ensure that citizens have clarity on who we do and do not work with. We are therefore committed to keeping an up to date list of every organisation that we have formally authorised to use AMS for commercial purposes. These clients are advised to follow our ethical guidelines and are bound by our terms and conditions regarding the need to obtain the informed consent of individuals about whom predictions are made. We encourage other providers of predictive technologies to honour the principles of privacy, transparency and relevance and publish a similar list of their own.
The Apply Magic Sauce API provides me with a solid example of an existing institution, “…committed to keeping an up to date list of every organization that we have formally authorised to use AMS for commercial purposes”, and encouraging their partners to “publish a similar list of their own”. What an important example to set when it comes to APIs, especially APIs that involve the amount of personally identifiable information (PII) that a social network possesses. It provides a positive model that ANY application who allows for the OAuth’ing of a user via their Twitter, Facebook, or another social network should be emulating.
Honestly, I’m still not 100% sure about what they are up to at the University of Cambridge with their Apply Magic Sauce API–I am still getting going with my dive into their operations. However, from the amount of work they’ve put into the ethics behind their API platform, as well as the high bar for transparency being set, I’m willing to give them the benefit of the doubt that they are up to good things. I’ll keep diving into their API, and monitor the activity in the community to make sure they are standing behind their pledge, and see whether or not their API partners are respecting the pledge as well.
I am hoping that the University of Cambridge provides me with a solid ethical example of not just how an API provider can behave and communicate around the transparency of their API consumption, but also how they can set the same expectations within their API community. If I had my way this would be the standard operating procedure for EVERY company, organization, institution, and government agency that possesses any PII for ANY citizen–100% transparency of EVERY single partner, customer, and application developer who has access to any personal data.
I know this will sound funny to many folks, but when I see APIs, I see language and communication, and humans learning to speak with each other in this new digital world we are creating for ourselves. My friend Erik Wilde (@dret) tweeted a reminder for me that APIs are indeed a language.
APIs are languages. show me one #API aspect that cannot be adequately framed in the context of language design practices and challenges.— Erik Wilde (@dret) June 4, 2017
Every second on our laptops and mobile phone we are communicating with many different companies and individuals. With each wall post, Tweet, photo push, or video stream we are communicating with our friends, family, and the public. Each of these interactions is being defined and facilitated using an API. An API call just for saying something in text, in an image, or video. API is the digital language we use to communicate online and via our mobile devices.
Uber geeks like me spend their days trying to map out and understand these direct interactions, as well as the growing number of indirect interactions. For every direct communication, there are usually numerous other indirect communications with advertisers, platform providers, or maybe even law enforcement, researchers, or anyone else with access to the communication channels. We aren’t just learning to directly communicate, we are also being conditioned to participate indirectly in conversations we can’t see–unless you are tuned into the bigger picture of the API economy.
When we post that photo, companies are whispering about what is in the photo, where it was taken, and what meaning it has. When we share that news link on Facebook, companies have a discussion about the truthfulness and impact of the link, maybe the psychological profile behind the link and where we fit into their psychological profile database. In some scenarios, they are talking directly about us personally like we are sitting in the room, other times they are talking about us like we are just a number in a larger demographic pool.
In alignment with the real world, the majority of these conversations are being held between men, behind closed doors. Publicly the conversations are usually directed by people with a bullhorn, talking over others, as well as whispering behind, and around people while they are completely unaware that these conversations about them are even occurring. The average person is completely unaware these conversations are happening. They can’t hear the whispering, or just do not speak the language that is being used around them, about them, each moment of each day.
Those of us in the know are scrambling to understand, control, and direct the conversations that are occurring. There is a lot of money to be made when you are part of these conversations. Or at least have a whole bunch of people on your platform to have a conversation about, or around. People don’t realize that for every direct conversation you have online, there are probably 20 conversations going on about this conversation. What will they buy next? Who do they know? What is in that photo they just shared? Is this related post interesting to them? API-driven echoes of conversation upon conversations into infinity.
Sometimes I feel like Dr. Xavier from the X-men in that vault room connected to the machine when I am on the Internet studying APIs. I’m seeing millions of conversations going on–it is deafening. I don’t just see or hear the direct conversations, I hear the deafening sounds of advertisers, hackers, researchers, police, government, and everyone having a conversation around us. Many folks feel like the average person shouldn’t be included in the conversation–they do not have the interest or awareness to even engage. To me, it just feels like a new secretive world augmenting our physical worlds, where our digital selves are learning to speak with each other. What troubles me though, is that not everyone is actually engaged in the conversations they are included in, and are often asleep or sedated while their personal digital self is being manipulated, exploited, and p0wn3d.
The way we view personal data in this early Internet age will continue to change and evolve, until one day we are looking back at this period and find we are shocked regarding how we didn’t see people’s digital bits as their own, and something we should respect and protect the privacy and security of.
Right now my private, network shared, or even public posts are widely viewed as a commodity, something the platform operator, and other companies have every right to buy, sell, mine, extract, and generally do as they wish. Very few startups see these posts as my personal thoughts, they simply see the opportunity for generating value and revenue as part of their interests. Sure, there are exceptions, but this is the general view of personal data in this Internet age.
We are barely 20 years into the web being mainstream, and barely over five years into mobile phones being mainstream. We are only beginning to enter even more immersions of Internet in our lives via our cars, televisions, appliances, and much more. We are only getting going when it comes to generating and understanding personal data, and the impacts of technology on our privacy, security, and overall human well-being. What is going on right now will not stay the norm, and we are already seeing signs of pushback from humans regarding ownership of their data, as well as our privacy and security.
While technology companies and their investors seem all powerful right now, and many humans seem oblivious to what is going on, the landscape is shifting, and I’m confident that humans will prevail, and there will be pushback that begins helping us all define our digital self, and reclaim the privacy and security we are entitled to. When we look back on this period in 50 years we will not look favorably on companies and government agencies who exploited humans’ personal data. We will see the frenzy over big data generation and accumulation, and treating it like a commodity rather than something that belongs to a human, as deeply troubling.
Which side of history are you going to be on?
When I suggest modern approaches to API management be applied to public data I always get a few open data folks who push back saying that public data shouldn’t be locked up, and needs to always be publicly available–as the open data gods intended. I get it, and I agree that public data should be easily accessible, but there are increasingly a number of unintended consequences that data stewards need to consider before they publish public data to the web in 2017.
I’m going through this exercise with my recommendations and guidance for municipal 211 operators when it comes to implementing Open Referral’s Human Services Data API (HSDA). The schema and API definition centers around the storage and access to organizations, locations, services, contacts, and other key data for human services offered in any city–things like mental health resources, suicide assistance, food banks, and other things we humans need on a day to day basis.
This data should be publicly available, and easy to access. We want people to find the resources they need at the local level–this is the mission. However, once you get to know the data, you start understanding the importance of not everything being 100% public by default. When you come across listings for Muslim faith and LGBTQ services, or possibly domestic violence shelters and needle exchanges, you see there are numerous types of listings where we need to be having sincere discussions around security and privacy concerns, and possibly think twice about publishing all or part of a dataset.
This is where modern approaches to API management can lend a hand. Where we can design specific endpoints, that pull specialized information for specific groups of people, and define who has access through API rate limiting. Right now my HSDA implementation has two access groups, public and private. Every GET path is publicly available, and if you want to POST, PUT, or DELETE data you will need an API key. As I consider my API management guidance for implementors, I’m adding a healthy dose of the how and why of privacy and security using existing API management solutions and practice.
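To make the two access groups concrete, here is a minimal sketch of the rule described above: reads are public, writes need a key. The function names, the example key, and the in-memory key set are all hypothetical placeholders of mine, not part of the HSDA specification or any particular API management product, which would normally enforce this at the gateway.

```python
# Hypothetical sketch of a two-group access rule: public reads, keyed writes.
# None of these names come from the HSDA specification; they are illustrative only.

READ_METHODS = {"GET"}                       # public group
WRITE_METHODS = {"POST", "PUT", "DELETE"}    # private group, API key required


def access_group(method):
    """Return the access group ("public" or "private") for an HTTP method."""
    method = method.upper()
    if method in READ_METHODS:
        return "public"
    if method in WRITE_METHODS:
        return "private"
    raise ValueError(f"unsupported method: {method}")


def is_allowed(method, api_key=None, valid_keys=frozenset()):
    """Public requests always pass; private requests need a known API key."""
    if access_group(method) == "public":
        return True
    return api_key is not None and api_key in valid_keys


if __name__ == "__main__":
    keys = {"example-key"}  # assumed key store; a real deployment would not hardcode this
    print(is_allowed("GET"))                         # anonymous read
    print(is_allowed("POST"))                        # anonymous write
    print(is_allowed("DELETE", "example-key", keys)) # keyed write
```

A sensitive listing type could be handled the same way, by moving specific GET paths into the private group instead of leaving every read public by default.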
I am not interested in playing decider when it comes to what data is public, private, and requires approval before getting access. I’m merely thinking about how API management can be applied in the name of privacy and security when it comes to public data, and how I can put tools in the hands of data stewards, and API providers that help them make the decision about what is public, private, and more tightly controlled. The trick with all of this is deciding how transparent providers should be about the limits and restrictions imposed, and communicating the official stance with all stakeholders appropriately when it comes to public data privacy and security.
I just got back from two weeks in the United Kingdom, which was my first international travel in a Trump and Brexit dystopia. My travel leaving the country, and coming back through LAX were uneventful, but it gave me the opportunity to begin pulling together my procedures for crossing borders with my digital devices.
Pausing, and thinking about which devices I will be traveling with, what I am storing on these devices, and the applications I'm operating provided a significant opportunity to get my security and privacy house in order. It allows me to go through my digital self, think about the impact of traveling with too much data, and prepare and protect myself from potentially compromising situations at the border.
When I started my planning process I had invested in a burner Google Chromebook, but because I'm currently working on some projects that require Adobe products, I resorted to taking my older MacBook Air, which I would be alright leaving behind if it were confiscated (who wants a violated machine?). When it comes to my iPhone and iPad, I cannot afford to leave them behind, as I need the most recent iPhone to work with my new Mavic drone, and my Osmo+ video camera--both from DJI.
Any law enforcement looking to get access to my MacBook, iPhone, or iPad are going to go after all my essential bits: contacts, messages, images, audio, and video. So I made sure that all of these areas are cleared before I crossed any border. I keep only the applications I need to navigate and stay in touch with key people--no social media except Twitter. Since I use OnePass for my password management, I don't actually know any of my passwords, and once I remove the OnePass application, I can't actually get into anything I'm not already logged into.
The process of developing my border processing procedure also helped me think through my account hierarchy. My iCloud and Google are definitely my primary accounts, with everything else remembered by OnePass. I even set up an alternate Kin Lane for iCloud, Twitter, and Google, which I log into with all of my devices when crossing a border. I made sure all social and messaging applications are removed except for the essentials, and double checked I had two-factor authentication turned on for EVERYTHING.
I store nothing on the iPhone or the iPad. Everything on my Macbook is stored in a synced Dropbox folder, which is removed before any border is crossed. I clear all SD cards and camera storage on the device. Everything is stored in the cloud when I travel, leaving nothing on the device. When you are really in tune with the bits you create and need to operate each day, it isn't much work to minimize your on-device footprint like this. The more you exercise, the easier it is to keep the data you store on-device as minimal as you possibly can. One footnote on storage though--if you can't get all your data uploaded to the cloud in time because of network constraints, store on mini-SD cards, which can be hidden pretty easily.
The areas I focused on as part of my procedure were device storage, application connections to the cloud, and what is baked into the device like the address book, etc.--running everything in the lightest, bare bones mode possible. It really sucks that we have to even do this at all, but I actually find the process rewarding--think of it like fasting, but for your digital self. I'm looking forward to further refining my approach and keeping it as something that I do EVERY time I cross a border. Eventually, it will just become standard operating procedure, and something I do without thinking, and will definitely begin to impact my more permanent digital footprint--keeping everything I do online as thoughtful, meaningful, and secure as possible.
Evernote employees have not read, and do not read, your note content. We only access notes under strictly limited circumstances: where we have your permission, or to comply with our legal obligations. Your privacy, and your trust in Evernote are the most important things to us. They are at the heart of our company, and we will continue to focus on that now and in the future.
While I am thankful for their change of heart, I wanted to take a moment to point out the wider online environment that incentivizes this type of behavior. This isn't a single situation with Evernote reading our notes, this is the standard mindset for startups operating online, and via our mobile devices in 2017. This is just one situation that was called out, resulting in a change of heart by the platform. Our digital bits are being harvested in the name of machine learning and artificial intelligence across the platforms we depend on daily for our business and personal lives.
In these startups' quest for profit, and ultimately their grand exit, they are targeting our individual and business bits. Use this free web application. Use this free mobile application. Plug this device in at home on your network. Let our machine learning evaluate your database for insights. Let me read your most personal and intimate thoughts in your diary, journal, and notebook. I will provide you with entertainment, convenience, and insight that you would never be able to achieve on your own, without the magic of artificial intelligence.
In my experience, the position a company takes on their API provides an early warning system for these situations. Evernote sent all the wrong signals to their developer community years ago, letting me know it was not a platform I could depend on, and trust for my business operations, let alone my personal, private thoughts. I was able to get off the platform successfully, but the trick in all of this is identifying other platforms who are primed for making similar decisions and helping the average users understand the motives behind all of these "solutions" we let into our lives, so they can make an educated decision on their own.
I am reading through the API task force recommendations out of the Office of the National Coordinator for Health Information Technology (ONC), to help address privacy and security concerns around mandated API usage as part of the Common Clinical Data Set, Medicare, and Medicaid Electronic Health Records. The recommendations contain a wealth of valuable insights around healthcare APIs but are also full of patterns that we should be applying across other sectors of our society where APIs are making an impact. To help me work through the task force's recommendations, I will be blogging through many of the different concepts at play.
Beyond the usage of "patient-directed APIs" that I wrote about earlier, I thought the pragmatic view on API privacy and security was worth noting. When it comes to making data, content, and other digital resources available online, I hear the full spectrum of concerns, and it leaves me optimistic to hear government agencies speak about security and privacy in such a balanced way.
Here is a section from the API task force recommendations:
Like any technology, APIs allow new capabilities and opportunities and, like any other technology, these opportunities come with some risks. There are fears that APIs may open new security vulnerabilities, with apps accessing patient records "for evil", and without receiving proper patient authorization. There are also fears that APIs could provide a possible "fire hose" of data as opposed to the "one sip at a time" access that a web site or email interface may provide.
In testimony, we heard almost universally that, when APIs are appropriately managed, the opportunities outweigh the risks. We heard from companies currently offering APIs that properly managed APIs provide better security properties than ad-hoc interfaces or proprietary integration technology.
While access to health data via APIs does require additional considerations and regulatory compliance needs, we believe existing standards, infrastructure, and identity proofing processes are adequate to support patient directed access via APIs today.
The document is full of recommendations on how to strike this balance. It is refreshing to hear such a transparent vision of what APIs can be. They weigh the risks alongside the benefits that APIs bring to the table, while also being fully aware that a "properly managed API" provides its own security. Another significant aspect of these recommendations for me is that they also touch on the role that APIs will play in the regulatory and compliance process.
I have to admit, the area of healthcare APIs isn't one of the most exciting stacks in the over 50 areas I track on across the API space, but I'm fully engaged with this because of the potential of a blueprint for privacy and security that can be applied with other types of APIs. When it comes to social, location, and other data the bar has been set pretty low when it comes to privacy and security, but health care data is different. People tend to be more concerned with access, security, privacy, and all the things we should already be discussing when it comes to the rest of our digital existence--opening the door for some valuable digital literacy discussions.
Hopefully, I don't run you off with all my healthcare API stories, and you can find some gems in the healthcare API task force's recommendations, like I am.
I always dig it when API stories spin out of control, and I end up down story holes. I'm sure certain people waiting for other work from me do not appreciate it, but these are where some of the best stories in my world come from. I was writing the story about Best Buy limiting access to their API when you have a free email account, which resulted in the story about Best Buy using Medium for their API platform blog presence, which ended up pushing me to read Medium's terms of service.
Maintaining the legal side of your platform operations on Github, taking advantage of the version control built in, makes a lot of sense. It also opens the door for using Github Issue Management, and the other more social aspects of Github, to assist in the communication of legal changes, as well as to facilitate ongoing conversations around changes in real time. I can see eventually working this into some sort of rating system for API providers, a sort of open source regulatory consideration, that is totally opt in by API platforms -- if you give a shit you'll do it, if you don't, you won't.
The NextWeb had a great story today that Google has redesigned its developer policies with clearer language and visual examples, and normally I don't just parrot what the tech blogosphere publishes, but it's an important enough API message that I think it warrants repeating. In my experience, API providers just emulate what they hear in the space, and stories like this need amplification.
What Google did isn't rocket surgery, they just simplified the legalese around what they expect of developers. What better way to help ensure these best practices around the platform actually happen than by providing simple titles, summary descriptions, images, and other relevant links on a developer policy page? In most scenarios developers aren't malicious, they are just ignorant of the best practices that often exist in the platform's legalese we agree to, but seldom ever read.
My friend Tyler Singletary (@harmophone) stated that, "the organization here for policy is fun, mostly clear, and interactive. A company the size of Google distilling the deep TOS policy into such a user and developer friendly front end is pretty astounding." -- I agree. Simplicity like this, when you operate at Google scale, can be pretty hard, and is something I'd love to see standardized across Google offerings.
I will be adding any of the elements from the simplified Google Play developer policy that are missing in my API terms of service, privacy, licensing, and branding research. I will also add the concept of having visuals for each area, or building block, and add the common theme of "keep it simple", helping API providers remember that they actually want developers reading and understanding this stuff, not having it buried in the legalese that nobody will give a shit about.
As we get close to wrapping up the first month in 2015, it is clear that Internet security and privacy will continue to be front and center this year. As technology continues to play a central role in our personal and business lives--security, transparency, and respect for privacy are only growing more critical.
I know I'm biased in thinking that APIs will continue to take a central role in this conversation, but I feel it is true. Many of the existing conversations around security about platforms like Snapchat, and MoonPig, are directly related to APIs, while the scope of other security incidents at companies like Sony, JP Morgan Chase, and beyond could easily be reduced with a sensible API strategy.
Companies are increasingly operating online, but do not act like any of their information lives in an online environment. Adopting an API approach to defining company resources helps map out this surface area, acknowledging it is available over the Internet, and works to define, secure, and monitor this surface in a healthier way.
Mobile users need access to their data, and by applying an API centric approach, providing account management, data portability, and access and identity controls using oAuth, you can increase transparency, while also strengthening overall security. If your company operations are centered around customer and end-user data transactions, you should be making all data points available via an API, accompanied by a well oiled oAuth layer to help end-users manage their resources and play a significant role in their own privacy and security.
I'm not delusional in thinking that APIs provide a perfect solution for all our security and privacy woes, they don't, but they do set a tone for a more healthy conversation about how companies are doing business on the open Internet, and how we can better secure the online web, mobile, and device-based applications we are increasingly depending on in this new world we have created.
As I read and listen to all of the Internet of Things stories coming out of CES, I’m happy to be hearing discussions around privacy and security come out of the event. I feel better about IoT security and privacy when I hear things like this, but ultimately I am left with overwhelming concern about the quantity of IoT devices.
There are many layers to securing IoT devices, and protecting the privacy of IoT users, but I can't help but think that Internet of Things security and privacy will always begin by asking ourselves if we should be doing this at all. Do we need this object connected to the Internet? Are we truly benefiting from having this item enabled with cloud connectivity?
I'm going to try and keep up with tracking on the API layer being rolled out in support of IoT devices, but I'm not sure I will be able to keep up with the number of devices, and the massive amount of hype around products and services. At some point I may have to tap out, and focus on specific aspects of IoT connectivity, around what I consider the politics of APIs.
My Response To How Can the Department of Education Increase Innovation, Transparency and Access to Data? 02 Jun 2014
I spent considerable time going through the Department of Education RFI, answering each question in as much detail as I possibly could. You can find my full response below. In the end I felt I could provide more value by summarizing my response, eliminating much of the redundancy across different sections of the RFI, and just cut through the bureaucracy as I (and APIs) prefer to do.
Open Data By Default
All publicly available data at the Department of Education needs to be open by default. This is not just a mandate, this is a way of life. There is no data that is available on any Department of Education websites that should not be available for data download. Open data downloads are not separate from existing website efforts at Department of Education, they are the other side of the coin, making the same content and data available in machine readable formats, rather than available via HTML—allowing valuable resources to be used in systems and applications outside of the department’s control.
Open API When There Are Resources
The answer to whether or not the Department of Education should provide APIs is the same as whether or not the agency should deploy websites: YES! Not all individuals and companies will have the resources to download, process, and put downloadable resources to use. In these situations APIs can provide much easier access to open data resources, and when open data resources are exposed as APIs it opens up access to a much wider audience, even non-developers. Lightweight, simple API access to the open data inventory should be the default, along with data downloads, when resources are available. This approach to APIs by default will act as the training ground for not just 3rd party developers, but also internally, allowing Department of Education staff to learn how to manage APIs in a safe, read-only environment.
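To make the "other side of the coin" idea a little more concrete, here is a minimal sketch of how the same file offered as a download could back a simple, read-only lookup that returns JSON. The dataset, column names, and function names are all made up for illustration; this is not an actual Department of Education dataset or API.

```python
import csv
import io
import json

# Hypothetical sample of an open data download; the columns are invented for this sketch.
CSV_DOWNLOAD = """school_id,name,state
1,Example College,OR
2,Sample University,WA
3,Demo Institute,OR
"""


def load_records(csv_text):
    """Parse the same CSV that is offered for download into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def query(records, **filters):
    """Read-only lookup: return the records that match every field=value filter."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in filters.items())
    ]


records = load_records(CSV_DOWNLOAD)

# The kind of JSON payload a lightweight API might return for a "state=OR" request.
oregon_json = json.dumps(query(records, state="OR"))
```

Nothing here writes back to the data, which is the point of starting with a safe, read-only layer on top of what is already published.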
Using A Modern API Design, Deployment, and Management Approach
As the usage of the Internet matured in 2000, many leading technology providers like SalesForce and Amazon began using web APIs to make digital assets available to 3rd party partners, and 14 years later there are some very proven approaches to designing, deploying and managing APIs. API management is not a new and bleeding edge approach to making assets available in the private sector; there are numerous API tools and services available, and this has begun to extend to the government sector with tools like API Umbrella from NREL, being employed by api.data.gov and other agencies, as well as other tools and services being delivered by 18F from GSA. There are many proven blueprints for the Department of Education to follow when embarking on a complete API strategy across the agency, allowing innovation to occur around specific open data, and other program initiatives, in a safe, proven way.
Use API Service Composition For Maximum Access & Control
One benefit of 14 years of evolution around API design, deployment, and management is the establishment of sophisticated service composition of API resources. Service composition refers to the granular, modular design and deployment of APIs, while being able to manage who has access to these resources. Modern API access is not just direct, public access to a database. API service composition allows for designing exactly the access to resources that is necessary, one that is in alignment with business objectives, while protecting the privacy and security of everyone involved. Additionally service composition allows for real-time awareness of how all data, content, and other resources at the Department of Education are accessed and put to use, allowing new APIs to be designed to support specific needs, and existing APIs to evolve based upon actual demand, not just speculation.
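As a rough illustration of service composition, the sketch below models access tiers that each bundle a set of resources, allowed HTTP methods, and a daily request limit. The tier names, resource paths, and limits are assumptions I made up for this example, not anything defined by the Department of Education or by a specific API management product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    resources: frozenset  # resource paths this tier may reach
    methods: frozenset    # HTTP methods this tier may use
    daily_limit: int      # requests allowed per day


# Hypothetical tiers: a public read-only plan and a trusted partner plan.
PLANS = {
    "public": Plan("public", frozenset({"/schools"}), frozenset({"GET"}), 1000),
    "partner": Plan(
        "partner",
        frozenset({"/schools", "/aid-programs"}),
        frozenset({"GET", "POST"}),
        50000,
    ),
}


def is_allowed(plan_name, resource, method, used_today):
    """Return True when the plan covers the resource and method and has quota left."""
    plan = PLANS.get(plan_name)
    if plan is None:
        return False
    return (
        resource in plan.resources
        and method in plan.methods
        and used_today < plan.daily_limit
    )
```

A call like is_allowed("public", "/schools", "GET", 10) passes, while the same consumer asking for "/aid-programs" is turned away. Changing who gets what becomes a matter of editing the plans rather than redesigning the API.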
Deeper Understanding Of How Resources Are Used
A modern API service composition layer opens up the possibility for a new analytics layer that is not just about measuring and reporting of access to APIs, it is about understanding precisely how resources are accessed in real-time, allowing API design, deployment and management processes to be adjusted in a more rapid and iterative way, that contributes to the roadmap, while providing the maximum enforcement of security and privacy of everyone involved. When the Department of Education internalizes a healthy, agency-wide API approach, a new real-time understanding will replace this very RFI centered process that we are participating in, allowing for a new agility, with more control and flexibility than current approaches. An RFI cycle takes months, and will contain a great deal of speculation about what would be, whereas API access, coupled with healthy analytics and feedback loops, answers all the questions being addressed in this RFI, in real-time, reducing resource costs, and wasted cycles.
APIs Open Up Synchronous and Asynchronous Communication Channels
Open data downloads represent a broadcast approach to making Department of Education content, data and other resources available, representing a one way street. APIs provide two-way communication, bringing external partners and vendors closer to the Department of Education, while opening up feedback loops with the Department of Education, reducing the distance between the agency and its private sector partners, and potentially bringing valuable services closer to students, parents and the companies or institutions that serve them. Feedback loops at the Department of Education are currently much wider, occurring on an annual or monthly basis, or at the speed of email or phone calls, with the closest being in person at events, something that can be a very expensive endeavor. Web APIs provide a real-time, synchronous and asynchronous communication layer that will improve the quality of service between the Department of Education and the public, for a much lower cost than traditional approaches.
Building External Ecosystem of Partners
With the availability of high value API resources, coupled with a modern approach to API design, deployment and management, an ecosystem of trusted partners can be established, allowing the Department of Education to share the workload with an external partner ecosystem. API service composition allows the agency to open up access to resources to only the partners who have proven they will respect the privacy and security of resources, and be dedicated to augmenting and helping extend the mission of the Department of Education. As referenced in the RFI, think about the ecosystem established by the IRS modernized e-file system, and how the H&R Blocks and Jackson Hewitts of the world help the IRS share the burden of the country's tax system. Where is the trusted ecosystem for the Department of Education? The IRS ecosystem has been in development for over 25 years, so the Department of Education has to get to work on theirs now.
Security Fits In With Existing Website Security Practices
One of the greatest benefits of web APIs is that they utilize existing web technologies that are employed to deploy and manage websites. You don’t need additional security approaches to manage APIs beyond existing websites. Modern web APIs are built on HTTP, just like websites, and security can be addressed right alongside current website security practices. Instead of delivering HTML, APIs are delivering JSON and XML. APIs even go further, and by using modern API service composition practices, the Department of Education gains an added layer of security and control, which introduces granular levels of access to all resources, something that does not exist for websites. With a sensible analytics layer, API security isn’t just about locking down, it is about understanding who is accessing resources, how they are using them, and striking a balance between the security and access of resources, which is the hallmark of APIs.
oAuth Gives Identity and Access Control To The Student
Beyond basic web security, and the heightened level of control modern API management delivers, there is a 3rd layer to the security and privacy of APIs that does not exist anywhere else: oAuth. Open Authorization, or oAuth, provides an identity and access layer on top of APIs that gives end-users, and owners of personal data, control over who accesses their data. Technology leaders in the private sector are all using oAuth to give platform users control over how their data is used in applications and systems. oAuth is the heartbeat of API security, giving API platforms a way to manage security, and how 3rd party developers access and put resources to use, in a way that gives control to end users. In the case of the Department of Education APIs, this means putting the parent and student at the center of who accesses, and uses their personal data, something that is essential to the future of the Department of Education.
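To show the kind of control I am talking about, here is a toy sketch of the consent bookkeeping that sits behind an oAuth-style layer: a user grants an application a token limited to certain scopes, and can revoke it later. This is not an implementation of the oAuth protocol itself (there are no redirects, client secrets, or expirations), and the user, application, and scope names are hypothetical.

```python
import secrets


class ConsentStore:
    """Tracks which application a user has authorized, and for which scopes."""

    def __init__(self):
        self._grants = {}  # token -> (user, app, scopes)

    def grant(self, user, app, scopes):
        """The user approves an app for specific scopes and the app receives a token."""
        token = secrets.token_urlsafe(16)
        self._grants[token] = (user, app, frozenset(scopes))
        return token

    def revoke(self, user, app):
        """The user withdraws an app's access; every token for that pair stops working."""
        stale = [
            token
            for token, (granted_user, granted_app, _) in self._grants.items()
            if granted_user == user and granted_app == app
        ]
        for token in stale:
            del self._grants[token]
        return len(stale)

    def allows(self, token, scope):
        """Check whether a presented token is still valid for the requested scope."""
        grant = self._grants.get(token)
        return grant is not None and scope in grant[2]


# Hypothetical names: a student lets a loan application read their loan data.
store = ConsentStore()
token = store.grant("student-1", "loan-app", {"loans:read"})
```

The important part is that the student, not the platform or the application, decides when store.revoke("student-1", "loan-app") gets called.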
How Will Policy Be Changed?
I'm not a policy wonk, nor will I ever be one. One thing I do know is you will never understand the policy implications in one RFI, nor will you change policy to allow for API innovation in one broad stroke--you will fail. Policy will have to be changed incrementally, a process that fits nicely with the iterative, evolutionary life cycle of API management. The cultural change at the Department of Education, as well as evolutionary policy change at the federal level, will be the biggest benefits of APIs at the Department of Education.
An Active API Platform At Department of Education Would Deliver What This RFI Is Looking For
I know it is hard for the Department of Education to see APIs as something more than a technical implementation, and you want to know, understand and plan everything ahead of time, as this is baked into the risk averse DNA of government. Even with this understanding, as I go through the RFI, I can’t help but be frustrated by the redundancy, bureaucracy, over planning, and waste that is present in this process. An active API platform would answer every one of the questions you pose, with much more precision than any RFI can ever deliver.
If the Department of Education had already begun evolving an API platform for all open data sets currently available on data.gov, the agency would have the experience in API design, deployment and management to address 60% of the concerns posed by this RFI. Additionally the agency would be receiving feedback from existing integrators about what they need, who they are, and what they are building to better serve students and institutions. Because this does not exist there will be much speculation about who will use Department of Education APIs, and how they will use them and better serve students. While much of this feedback will be well meaning, it will not be rooted in actual use cases, applications and existing implementations. An active API ecosystem answers these questions, while keeping answers rooted in actual integrations, centered around specific resources, and actual next steps for real world applications.
The learning that occurs from managing read-only API access, to low-level data, content and resources would provide the education and iteration necessary for the key staff at Department of Education to reach the next level, which would be read / write APIs, complete with oAuth level security, which would be the holy grail in serving students and achieving the mission of the Department of Education. I know I’m biased, because of my focus on APIs, but read / write access to all Department of Education resources over the web and via mobile devices, that gives full control to students, is the future of the agency. There is no "should we do APIs", there is only the how, and I’m afraid we are wasting time, and we need to just do it, and learn to ask these questions along the way.
There are proven technologies and processes available to make all Department of Education data, content and resources available, allowing both read and write access in a secure way, that is centered around the student. The private sector is 14 years ahead of the government in delivering private sector resources in this way, and other government agencies are ahead of the Department of Education in doing this as well, but there is an opportunity for the agency to still lead and take action, by committing the resources necessary to not just deploy a single API, but internalize APIs in a way that will change the way learning occurs in the coming decades across all US institutions.
A. Information Gaps and Needs in Accessing Current Data and Aid Programs
1. How could data sets that are already publicly available be made more accessible using APIs? Are there specific data sets that are already available that would be most likely to inform consumer choice about college affordability and performance?
Not everyone has the resources to download, process and put open datasets to use. APIs can make all of the publicly available datasets more available to the public, allowing for easy URL access, deployment of widgets, visualizations as well as integration with existing tools like Microsoft Excel. All datasets should have the option of being published in this way, but ultimately the Dept. of Ed API ecosystem should speak to which datasets would be most high value, and warrant API access.
2. How could APIs help people with successfully and accurately completing forms associated with any of the following processes: FAFSA; Master Promissory Note; Loan Consolidation; entrance and exit counseling; Income-Driven Repayment (IDR) programs, such as Pay As You Earn; and the Public Student Loan Forgiveness program?
APIs will help decouple each data point on a form. Introductory information, each question, and other supporting resources can be broken up and delivered via any website or mobile application. This evolves a form from a linear, 2-dimensional document into an interactive application that people can engage with, providing the assistance needed to properly achieve the goals surrounding a form.
Each form initiative will have its own needs, and a consistent API platform and strategy from the Department of Education will help identify each form's unique requirements, and the custom delivery of just the resources that are needed for a form's target audience.
3. What gaps are there with loan counseling and financial literacy and awareness that could be addressed through the use of APIs to provide access to government resources and content?
First, APIs can provide access to the content that educates students about the path they are about to embark on, before they do, via web and mobile apps they frequent already, without being required to visit the source site and learn. Putting the information students need into their hands, via their mobile devices, will increase the reach of content and increase the chances that students will consume it.
Second, APIs plus oAuth will give students access to and control over their own educational finances, forcing them to better consider how they will manage all the relationships they enter into, the details of loans and grants, and their dealings with the schools they attend. With more control over data and content will come a forced responsibility in understanding and managing their finances.
Third, this process will open up students' eyes to the wider world of online data and information, and to the fact that APIs are driving all aspects of their financial life, from their banking and credit cards to managing their online credit score.
APIs are at the heart of the API driven digital economy. The gift that would be given to students when they first leave home, in the form of API literacy, would carry with them throughout their lives, allowing them to better manage all aspects of their online and financial lives, and the Department of Education would have given them that start.
4. What services that are currently provided by title IV student loan servicers could be enhanced through APIs (e.g., deferment, forbearance, forgiveness, cancellation, discharge, payments)?
A consistent API platform and strategy from the Department of Education would provide for the evolution of a suite of verified partners, such as title IV student loan servicers. A well planned partner layer within an ecosystem would allow student loan servicers to access data from students in real-time, with students having a say in who has access to the data and how. These dynamics, introduced by and unique to API platforms that employ oAuth, provide new opportunities for partnerships to be established, evolve and even be terminated when not going well.
API platforms using oAuth provide a unique 3-legged relationship between the data platform, 3rd party service providers and students (users), that can be adopted to bring in existing industry partners, but more importantly provide a rich environment for new types of partners to evolve, that can improve the overall process and workflow a student experiences.
5. What current forms or programs that already reach prospective students or borrowers in distress could be expanded to include broader affordability or financial literacy information?
All government forms and programs should be evaluated for the pros / cons of an API program. My argument within this RFI response will be focused on a consistent API platform and strategy from the Department of Education. APIs should be part of every existing program change, and new initiatives in the future.
B. Potential Needs to be Filled by APIs
1. If APIs were available, what types of individuals, organizations, and companies would build tools to help increase access to programs to make college more affordable?
A consistent API platform and strategy from the Department of Education will have two essential components: a partner framework and service composition. A partner framework defines which external, 3rd party groups can work with Department of Education API resources. The service composition defines how these 3rd party groups can access and ultimately use Department of Education API resources.
All existing groups that the Department of Education interacts with currently should be evaluated for where in the API partner framework they exist, defining levels of access for general public, student up to certified and trusted developer and business partnerships.
The partner framework and service composition for the Department of Education API platform should be applied to all existing individuals, organizations and companies, while also allowing for new actors to enter the game, potentially redefining the partner framework and adding new formulas for API service composition, opening up the possibilities for innovation around Department of Education API resources.
2. What applications and features might developers, schools, organizations, and companies take interest in building using APIs in higher education data and services?
As with which Department of Education forms and programs might have APIs applied, and which individuals, organizations and companies will use APIs, what applications developers, schools, organizations and companies might build with APIs cannot be known until an API platform is in place. These are the questions an API centric company or institution asks of its API platform in real-time. You can’t define who will use an API and how they will use it; it takes iteration and exploration before successful applications will emerge.
3. What specific ways could APIs be used in financial aid processes (e.g., translation of financial aid forms into other languages, integration of data collection into school or State forms)?
When a resource is available via an API, it is broken down into the smallest possible parts and pieces, allowing them to be re-used and re-purposed into every possible configuration. When you make form questions independently available via an API, it allows you to reorder them, translate them, and ask them in new ways.
This approach works well with forms, allowing each entry of a form to be accessible, transferable, and opened up for access, with the proper permissions and access level that is owned by the person who owns the form data. This opens up not just the financial aid process, but all form processes to interoperate with other systems, forms, agencies and companies.
With the newfound modularity and interoperability introduced by APIs, the financial aid process could be broken down, allowing parents to do their part, schools to do theirs, and multiple agencies to be engaged, such as the IRS or Department of Veterans Affairs (VA). All of this allows any involved entity or system to do its part for the financial aid process, minimizing the friction throughout the entire form process, even year over year.
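The modular form idea above can be sketched in a few lines. Each question lives on its own, keyed by an identifier, with translations attached, so any application can pull just the questions it needs, in the order and language that suit its audience. The question identifiers, wording, and translations below are invented for illustration and are not taken from the FAFSA or any other real form.

```python
# Hypothetical form questions, each stored as an independent resource.
QUESTIONS = {
    "household_size": {
        "en": "How many people are in your household?",
        "es": "¿Cuántas personas hay en su hogar?",
    },
    "veteran_status": {
        "en": "Are you a veteran?",
        "es": "¿Es usted veterano?",
    },
    "school_choice": {
        "en": "Which school do you plan to attend?",
    },
}


def render_form(question_ids, lang="en"):
    """Assemble a form from independent questions, falling back to English
    when a translation is not available."""
    return [
        {"id": qid, "text": QUESTIONS[qid].get(lang, QUESTIONS[qid]["en"])}
        for qid in question_ids
    ]


# A partner serving Spanish-speaking veterans could lead with the veteran question.
spanish_form = render_form(["veteran_status", "household_size", "school_choice"], lang="es")
```

A school, a parent-facing app, and another agency could each call render_form with a different subset, which is what I mean by letting every party do only its part of the form.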
4. How can third-party organizations use APIs to better target services and information to low-income students, first-generation students, non-English speakers, and students with disabilities?
Again, this is a question that should be asked in real-time of a Department of Education platform. Examples of how 3rd party organizations can better target services and information to students are the reason for an API platform. There is no way to know this ahead of time, so I will leave it to domain experts to attempt an answer.
5. Would APIs for higher education data, processes, programs or services be useful in enhancing wraparound support service models? What other types of services could be integrated with higher education APIs?
A sensibly designed, deployed, managed and evangelized API platform would establish a rich environment for existing educational services to be augmented, but also allow for entirely new types of services to be defined. Again I will leave it to domain experts to speak of specific service implementations based upon their goals, and understanding of the space.
C. Existing Federal and Non-Federal Tools Utilizing APIs
1. What private-sector or non-Federal entities currently offer assistance with higher education data and student aid programs and processes by using APIs? How could these be enhanced by the Department’s enabling of additional APIs?
There are almost 10K public APIs available in the private sector. This should be viewed as a palette for developers, something they use as they are developing (painting) their apps (paintings). It is difficult for developers to know what they will be painting with, without knowing what resources are available. The open API innovation process rarely is able to articulate what is needed, then make that request for resources. API innovation occurs when valuable, granular resources are available from multiple sources, and developers assemble them, and innovate in new ways.
2. What private-sector or non-Federal entities currently work with government programs and services to help people fill out government forms? Has that outreach served the public and advanced public interests?
Another question that should be answered by the Department of Education, providing us with the answers. How would you know this without a properly defined partner framework? Stand up an API platform, and you will have the answer.
3. What instances or examples are there of companies charging fees to assist consumers in completing otherwise freely available government forms from other agencies? What are the advantages and risks to consider when deciding to allow third parties to charge fees to provide assistance with otherwise freely available forms and processes? How can any risks be mitigated?
I can't speak to what is already going on in the space regarding companies charging fees to consumers, as I am not an expert on the education space at this level. This is just such a new paradigm made possible via APIs and open data that there just aren’t that many examples in the space built around open government data.
First, the partner tiers of API platforms help verify and validate individuals and organizations who are building applications and charging for services in the space. A properly designed, managed and policed partner tier can assist in mitigating risk in the evolution of such business ecosystems.
Second, API driven security layers using oAuth give control to end-users, allowing students to decide which applications and ultimately service providers have access to their data, revoking access when services are done or a provider is undesirable. With proper reporting and rating systems, policing of the API platform can be something that is done within the community, with the last mile of policing being done by the Department of Education.
Proper API management practices provide the identity, access and control layers necessary to keep resources and end-users safe. Ultimately who has access to data, can charge fees, and play a role in the ecosystem is up to the Department of Education and end-users when applications are built on top of APIs.
4. Beyond the IRS e-filing example, what other similar examples exist where Federal, State, or local government entities have used APIs to share government data or facilitate participation in government services or processes - particularly at a scale as large as that of the Federal Student Aid programs?
This is a new, fast growing sector, and there are not a lot of existing examples, but there are a few:
An API driven system that allows citizens to report and interact with municipalities around issues within communities. While Open311 is deployed in specific cities such as Chicago and Baltimore, it is an open source platform and API that can be deployed to serve any size market.
The US Census provides open data and APIs, allowing for innovation around government census survey data, used across the private sector in journalism, healthcare, and many other ways. The availability of government census data is continually spawning new applications, visualizations and other expressions, that wouldn’t be realized or known, if the platform wasn’t available.
We The People
The We The People API allows for 3rd-party integration with the White House petition process. It currently allows only read-only access to the information and the petition process, but it is possibly one way that write APIs will emerge in the federal government.
There are numerous examples of open APIs and data being deployed in government, even from the Department of Education. All of them are works in progress, and will realize their full potential over time, maturation and much iteration and engagement with the public.
D. Technical Specifications
1. What elements would a read-write API need to include for successful use at the Department?
There are numerous building blocks that can be employed in managing read-write APIs, but there are a couple that will be essential to successful read-write APIs in government:
Defined access tiers for consumers of API data, with appropriate public, partner and private (internal) levels of access. All write methods are only accessible by partner and internal levels of access, requiring verification and certification of companies and individuals who will be building on top of API resources.
The ability to compose many different types of API resource access, create service bundles that are made accessible to different levels of partners. Service management allows for identity and access management, but also billing, reporting, and other granular level control over how services are composed, accessed and managed.
Open Authentication (oAuth 2.0)
All data that is made available via Department of Education API platforms and involves personally identifiable information will require the implementation of an open authentication, or oAuth, security layer. oAuth 2.0 provides an identity layer for the platform, requiring developers to use tokens that throttle access to resources for applications. This process is initiated, managed and revoked by end-users, providing the highest level of control over who has access to data, and what they can do with it, to the people whose personal data is involved.
Federated API Deployments
Not all APIs should be deployed and managed within the Department of Education firewall. API platforms can be made open source so that 3rd party partners can deploy within their own environments. Then via a sensible partner framework, the Department of Education can decide which partners they should not just allow to write to APIs, but also pull data from their trusted systems and open API deployments.
APIs provide the necessary access to all of federal government API resources, and a sensible partner framework, service management layer in conjunction with oAuth will provide the necessary controls for a read / write API in government. If agencies are looking to further push risk outside the firewall, federated API deployments with trusted partners will have to be employed.
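To make the access tier idea a little more concrete, here is a minimal sketch in Python. The tier names come from the description above, but the `Tier` enum, the `WRITE_METHODS` set and the `is_allowed` function are hypothetical names I am using for illustration, not part of any existing Department of Education platform.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Access tiers, ordered from least to most trusted."""
    PUBLIC = 1
    PARTNER = 2
    INTERNAL = 3


# Assumption for this sketch: these HTTP methods count as "write" methods.
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def is_allowed(tier: Tier, method: str) -> bool:
    """Return True if a consumer in `tier` may call an endpoint with `method`."""
    if method.upper() in WRITE_METHODS:
        # Write methods require a verified partner or internal consumer.
        return tier >= Tier.PARTNER
    # Read methods are open to every tier in this simplified sketch.
    return True
```

A real service management layer would also factor in service bundles and oAuth scopes, but the core decision is this kind of simple, auditable rule.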
2. What data, methods, and other features must an API contain in order to develop apps accessing Department data or enhancing Department processes, programs, or services?
There are about 75 common building blocks for API deployments (http://management.apievangelist.com/building-blocks.html), aggregated after looking at almost 10K public API deployments. Each government API will have different needs when it comes to other supporting building blocks.
3. How would read-only and/or read-write APIs interact with or modify the performance of the Department’s existing systems (e.g., FAFSA on the Web)? Could these APIs negatively or positively affect the current operating capability of such systems? Would these APIs allow for the flexibility to evolve seamlessly with the Department’s technological developments?
There are always risks with API access to resources, but with a partner framework, service management, oAuth, and other common web security practices, these risks can be drastically reduced and mitigated in real-time.
Isolated API Deployments
New APIs should rarely be deployed and directly connected to existing systems. APIs can be deployed as an isolated interface, with an isolated data store. Existing systems can use the same API interface to read / write data into the system and keep in sync with existing internal systems. API developers will never have access to existing systems and data stores, just isolated, defined API interfaces as part of a secure partner tier, only accessing the services they have permission to, and the end-user data that end-users themselves have granted access to.
As described above, if government agencies are looking to further reduce risk, API deployments can be designed and deployed as open source software, allowing partners within the ecosystem to download and deploy. A platform partner framework can provide a verification and certification process for federated API deployments, allowing the Department of Education to decide who they will pull data from, reducing the risk to internal systems and providing a layer of trust for integration.
Beyond these approaches to deploying APIs, one of the biggest benefits of web API deployments is that they use the same security as other government websites, just possessing an additional layer of security determining who has access, and to what.
It should be the rare instance when an existing system will have an API deployed with direct integration. API automation will provide the ability to sync API deployments with existing systems and data stores.
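Here is a rough sketch of what an isolated deployment with a sync step could look like. Everything in it is an assumption for illustration: `IsolatedStore`, `sync_to_internal` and the `accept` rule are hypothetical names, and plain dictionaries stand in for real data stores.

```python
from dataclasses import dataclass, field


@dataclass
class IsolatedStore:
    """Data store that backs the API only; consumers never touch internal systems."""
    records: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # ids changed since the last sync

    def write(self, record_id: str, data: dict) -> None:
        self.records[record_id] = data
        if record_id not in self.pending:
            self.pending.append(record_id)


def sync_to_internal(store: IsolatedStore, internal: dict, accept) -> list:
    """Pull pending records into the internal system.

    The agency-supplied `accept` rule decides what is imported; records that
    fail it stay in the isolated store and are simply not imported.
    """
    imported = []
    for record_id in store.pending:
        data = store.records[record_id]
        if accept(data):
            internal[record_id] = dict(data)
            imported.append(record_id)
    store.pending.clear()
    return imported
```

The design choice worth noticing is the direction of the data flow: the internal system pulls from the API data store on its own terms, and nothing pushes into it.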
4. What vulnerabilities might read-write APIs introduce for the security of the underlying databases the Department currently uses?
As stated above, there should be no compromise in how data is imported into existing databases at the Department of Education. It is up to the agency to decide which APIs they pull data from, and how it is updated as part of existing systems.
5. What are the potential adverse effects on successful operation of the Department’s underlying databases that read-write APIs might cause? How could APIs be developed to avoid these adverse effects?
As stated above, isolated and external, federated API deployments will decouple the risk from existing systems. This is the benefit of APIs: they can be deployed as isolated resources, and then, for integration and interoperability, internally and externally, it is up to the consumer to decide what is imported and what isn’t.
6. How should APIs address application-to-API security?
A modern API partner framework, service management and oAuth provide the necessary layer to identify who has access, and what resources can be used by not just a company and user, but by each application they have developed.
Routing all API access through the partner framework plus associated service level, will secure access to Department of Education resources by applications, with user and app level logging of what was accessed and used within an application.
OAuth provides a balance to this application-to-API security layer, allowing the Department of Education to manage security of API access and developers to request access for their applications, while ultimately control is in the hands of end-users to define which applications have access to their data.
7. How should the APIs address API-to-backend security issues? Examples include but are not limited to authentication, authorization, policy enforcement, traffic management, logging and auditing, TLS (Transport Layer Security), DDoS (distributed denial-of-service) prevention, rate limiting, quotas, payload protection, Virtual Private Networks, firewalls, and analytics.
Web APIs use the exact same infrastructure as websites, allowing for the re-use of existing security practices employed for websites. However APIs provide the added layer of security, logging, auditing and analytics provided through the lens of the partner framework, service composition and only limited by the service management tooling available.
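As a small illustration of the quota, logging and auditing pieces, here is a sketch of a per-key daily quota check that records every request. The quota numbers, the `DAILY_QUOTA` table and the `ServiceManager` class are made up for this example; real service management tooling will have its own configuration.

```python
from collections import defaultdict

# Hypothetical daily quotas per access tier; real values are a platform decision.
DAILY_QUOTA = {"public": 100, "partner": 10_000, "internal": 1_000_000}


class ServiceManager:
    """Enforces a daily quota per API key and logs every request for audit."""

    def __init__(self):
        self.calls = defaultdict(int)  # (api_key, day) -> calls allowed so far
        self.audit_log = []            # one entry per request, allowed or not

    def check(self, api_key: str, tier: str, day: str, resource: str) -> bool:
        used = self.calls[(api_key, day)]
        allowed = used < DAILY_QUOTA[tier]
        if allowed:
            self.calls[(api_key, day)] = used + 1
        # Denied requests are logged too, since they help reveal misuse.
        self.audit_log.append(
            {"key": api_key, "day": day, "resource": resource, "allowed": allowed}
        )
        return allowed
```

The same log that enforces the quota is what later answers the audit question of who accessed what, and when.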
8. How do private or non-governmental organizations optimize the presentation layer for completion and accuracy of forms?
Business rules. As demonstrated in a FAFSA API prototype, business rules for each form field, along with rejection codes, can also be made available via an API resource, allowing developers to build a form validation layer into all digital forms.
After submission, and beyond the first line of defense provided by API developers building next generation forms, platform providers can provide further validation, review and ultimately a status workflow that allows forms to be rejected or accepted based upon business logic.
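A sketch of what a business rules validation layer could look like on the developer side follows. The field names, patterns and rejection codes in `RULES` are invented for illustration and are not taken from the actual FAFSA prototype; in practice they would be fetched from the API rather than hard coded.

```python
import re

# Hypothetical business rules, shaped as they might be returned by an API resource.
RULES = {
    "ssn": {"pattern": r"\d{9}", "reject_code": "R01"},
    "birth_year": {"pattern": r"(19|20)\d{2}", "reject_code": "R02"},
}


def validate(form: dict) -> list:
    """Return the rejection code of every field that is missing or fails its rule."""
    codes = []
    for name, rule in RULES.items():
        value = str(form.get(name, ""))
        # fullmatch requires the whole value to satisfy the pattern.
        if not re.fullmatch(rule["pattern"], value):
            codes.append(rule["reject_code"])
    return codes
```

An empty list means the form passes this first line of defense, and the platform can still apply its own validation after submission.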
9. What security parameters are essential in ensuring there is no misuse, data mining, fraud, or misrepresentation propagated through use of read- only or read-write APIs?
A modern API service management layer allows the platform provider to see all API resources that are being accessed, and by whom, and to easily establish patterns for healthy usage, as well as patterns for misuse. When misuse is identified, service management allows providers to revoke access, and take action against companies and individuals.
Beyond the platform provider, APIs allow for management by end-users through common oAuth flows and management tools. Sometimes end-users can identify an app is misusing their data, even before a platform provider might. oAuth gives them the control to revoke access to their data, via the API platform.
oAuth, combined with API service management tooling, has allowed for a unique security environment in which the platform can easily keep operations healthy, but end-users and developers can help police the ecosystem as well. If platform providers give users the proper rating and reporting tools, they can help keep API and data consumers in check.
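To show the kind of end-user control described here, this is a minimal sketch of the consent bookkeeping behind granting and revoking access. It is not an implementation of the oAuth 2.0 protocol itself; `ConsentRegistry`, the scope strings and the user and app names are all hypothetical.

```python
import secrets


class ConsentRegistry:
    """Tracks which apps an end-user has granted access to, and with which scopes."""

    def __init__(self):
        self.grants = {}  # token -> {"user": ..., "app": ..., "scopes": set}

    def grant(self, user: str, app: str, scopes: set) -> str:
        """Record the user's consent and hand the app an opaque token."""
        token = secrets.token_urlsafe(16)
        self.grants[token] = {"user": user, "app": app, "scopes": set(scopes)}
        return token

    def revoke(self, user: str, app: str) -> int:
        """Remove every token `user` issued to `app`; return how many were removed."""
        doomed = [
            t for t, g in self.grants.items()
            if g["user"] == user and g["app"] == app
        ]
        for t in doomed:
            del self.grants[t]
        return len(doomed)

    def can(self, token: str, scope: str) -> bool:
        """Check whether a token is still valid and covers the requested scope."""
        g = self.grants.get(token)
        return g is not None and scope in g["scopes"]
```

Once a student revokes an app, every later `can` check for that app's token fails, which is the control the platform hands to end-users.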
10. With advantages already built into the Department’s own products and services (e.g., IRS data retrieval using FAFSA on the Web), how would new, third-party API-driven products present advantages over existing Department resources?
While existing products and services developed within the department do provide great value, the Department of Education cannot do everything on their own. Because of the access the Department has, some features will be better by default, but this won’t be the case in all situations.
The Department of Education and our government do not have unlimited resources, and with access to ALL resources available via the department, the private sector can innovate, helping share the load of delivering vital services. It's not whether or not public sector products and services are better than private sector or vice versa, it is about the public sector and private sector partnering wherever and whenever it makes sense.
11. What would an app, service or tool built with read-write API access to student aid forms look like?
Applications will look like TurboTax and TaxAct, developed within the IRS ecosystem, and like the tools developed by the Sunlight Foundation on top of government open data and APIs.
We will never understand what applications are possible until the necessary government resources are available. All digital assets should be open by default, with a consistent API platform and strategy from the Department of Education, and the platform will answer this question.
E. Privacy Issues
1. How could the Department use APIs that involve the use of student records while ensuring compliance with potentially applicable statutory and regulatory requirements, such as the Family Educational Rights and Privacy Act (20 U.S.C. § 1232g; 34 CFR Part 99) and the Privacy Act (5 U.S.C. § 552a and 34 CFR Part 5b)?
As described above, the partner framework, service management and oAuth layer provide the control and logging necessary to execute and audit as part of any applicable statutory and regulatory requirement.
I can’t articulate enough how this layer provides a tremendous amount of control over how these resources are accessed, giving control to the involved parties who matter the most: end-users. All API traffic is throttled, measured and reviewed as part of service management, enforcing privacy in a partnership between the Department of Education, API consumers and end-users.
2. How could APIs ensure that the appropriate individual has provided proper consent to permit the release of privacy-protected data to a third party? How can student data be properly safeguarded to prevent its release and use by third parties without the written consent often required?
As articulated above the partner framework, service management and oAuth address this. This is a benefit of API deployment, breaking down existing digital access, providing access and granular control, combined with oAuth and logging of all access—APIs take control to a new level.
oAuth has come to represent this new balance in security and control of digital resources, allowing the platform, developers and end-users to execute within their defined role on the platform. This balance introduced by APIs and oAuth allows data to be safeguarded, while also opening it up for the widest possible use in the next generation of applications and other implementations.
3. How might read-only or read-write APIs collect, document, and track individuals’ consent to have their information shared with specific third parties?
4. How can personally identifiable information (PII) and other financial information (of students and parents) be safeguarded through the use of APIs?
Access to personally identifiable information (PII) via Department of Education APIs will be controlled by students and their parents. The most important thing you can do to protect PII is to educate the owners of that data about how to allow developer access to it in responsible ways that will benefit them.
APIs open up access, while oAuth will give students and parents the control they need to integrate with apps and existing systems to achieve their goals, while retaining the greatest amount of control over safeguarding their own data.
5. What specific terms of service should be enabled using API keys, which would limit use of APIs to approved users, to ensure that information is not transmitted to or accessed by unauthorized parties?
A well designed partner layer that defines multiple levels of access, combined with sensible service packages, will establish the terms of service levels that will be bundled with API keys and oAuth level identity and access to personally identifiable information.
Common approaches to deploying partner layers with appropriate service tiers, using oAuth, have been well established over the last 10 years in the private sector. Controlling access to API resources at a granular level, providing the greatest amount of access that makes sense, while knowing who is accessing data and how they are using it, is what APIs are designed for.
6. What are the relative privacy-related advantages and disadvantages of using read-only versus read-write APIs for student aid data?
You will face many of the same privacy concerns whether an API is read or write. If it is personally identifiable information, read or write access by the wrong parties violates a student's privacy. Ensuring that data is updated only via trusted application providers is essential.
A properly defined partner layer will separate who has read and who has write access. Proper logging and versioning of data is essential to ensure data integrity, allowing end-users to manage their data via an application or system with confidence.
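One simple way to picture logging and versioning of data is an append-only record, where every write is kept along with who made it. This is just a sketch under my own assumptions; `VersionedRecord` and its methods are hypothetical and not drawn from any Department system.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedRecord:
    """Keeps every write, oldest first, along with the app that made it."""
    versions: list = field(default_factory=list)

    def write(self, data: dict, app: str) -> int:
        """Append a new version and return its number (starting at 1)."""
        self.versions.append({"data": dict(data), "written_by": app})
        return len(self.versions)

    def current(self) -> dict:
        """Return the most recently written data."""
        return self.versions[-1]["data"]

    def rollback(self, version: int, app: str) -> int:
        """Restore an earlier version by writing it again, so history is never lost."""
        return self.write(self.versions[version - 1]["data"], app)
```

If an application writes bad data, the end-user or the platform can restore an earlier version, and the log still shows exactly which application made which change.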
F. Compliance Issues
1. What are the relative compliance-related advantages and disadvantages of using read-only versus read-write APIs for student aid data?
APIs provide a single point of access to student aid data. With the implementation of a proper partner framework, service management and oAuth, every single action via this doorway is controlled and logged. When it comes to auditing ALL operations, whether they come from the public, partners or internal users, APIs excel in satisfying compliance concerns.
2. How can the Department prevent unauthorized use and the development of unauthorized products from occurring through the potential development of APIs? How might the Department enforce terms of service for API key holders, and prevent abuse and fraud by non-API key holders, if APIs were to be developed and made available?
As described above the partner framework, service management and oAuth will provide the security layer needed to manage 99% of potential abuse, but overall enforcement via the API platform is a partnership between the Department of Education, API consumers as well as end-users. The last mile of enforcement will be executed by the Department of Education, but it will be up to the entire ecosystem and platform to police and enforce in real-time.
3. What kind of burden on the Department is associated with enforcing terms and conditions related to APIs?
The Department of Education will handle the first line of defense, in defining partner tiers and service composition that wraps all access to APIs. The Department will also be the last mile of decision making and enforcement when violations occur. The platform should provide the data needed by the Department to make decisions, as well as the enforcement necessary in the form of API key and access revocation, and banning apps, individuals and businesses from the ecosystem.
4. How can the Department best ensure that API key holders follow all statutory and regulatory provisions of accessing federal student aid funds and data through use of third-party products?
The first line of defense to ensure that API key holders follow all statutory and regulatory provisions will be verification and validation of partners upon registration, when applications go into production, and when they become available in application galleries and other directories in which students discover apps.
The second line of defense will be reporting requirements and usage patterns of API consumers and their apps. If applications regularly meet self-reporting requirements and their real-time usage patterns show healthy behavior, they can retain their certification. If partners fail to comply, they will be restricted from the API ecosystem.
The last line of defense is the end-users, the students and parents. All end-users need to be educated regarding the control they have, and given reporting and ranking tools that allow them to file complaints and rank the applications that are providing quality services.
As stated several times, enforcement will be a community effort, something the Department of Education has ultimate control of, but requires giving the community agency as well.
5. How could prior consent from the student whom the data is about be provided for release of privacy- protected data to third party entities?
An API with an oAuth layer is this vehicle, providing the access, logging all transactions, and holding all partners to a quality of service. All the mechanisms are there in a modern API implementation; the access just needs to be defined.
6. How should a legal relationship between the Department and an API developer or any other interested party be structured?
I’m not a lawyer. I’m not a policy person. Just can’t contribute to this one.
7. How would a legal relationship between the Department and an API developer or any other interested party affect the Department’s current agreements with third-party vendors that operate and maintain the Department’s existing systems?
All of this will be defined in each partner tier, combined with appropriate service levels. With isolated API deployments, this should not affect current implementations.
However, a benefit of a consistent API strategy is that existing vendors can access resources via APIs, increasing the agility and flexibility of existing contracts. APIs are a single point of access, not just for the public, but for 3rd party partners as well as internal access. Everyone involved can participate and receive the benefits of API consumption.
8. What disclosures should be made available to students about what services are freely available in government domains versus those that could be offered at a cost by a third party?
A partner tier for the API platform will define the different levels of partners. Trusted, verified and certified partners will get different recommendation levels and access than lesser known services, and applications from 3rd party with lesser trusted levels of access.
9. If the Department were to use a third-party application to engage with the public on its behalf, how could the Department ensure that the Department follows the protocols of OMB Memorandum 10-23?
Again, the partner tier determines the level of access for the partner, and the protocols of all OMB memoranda can be built in, requiring that all data, APIs and code are open sourced, and that appropriate API access tiers are used, showing how data and resources are accessed and put to use.
API service management provides the reporting necessary to support government audits and regulations. Without this level of control on top of an API, this just isn’t possible at the scale that APIs plus web and mobile applications offer.
G. Policy Issues
1. What benefits to consumers or the Department would be realized by opening what is currently a free and single-point service (e.g., the FAFSA) to other entities, including those who may charge fees for freely-available services and processes? What are the potential unintended consequences?
Providing API access to government resources is an efficient and sensible use of taxpayers' money, and reflects the mission of all agencies, not just the Department of Education. APIs introduce the agility and flexibility needed to deliver the next generation of government applications and services.
The economy in a digital age will require a real-time partnership between the public sector and the private sector, and APIs are the vehicle for this. Much like it has done for private sector companies like Amazon and Google, APIs will allow the government to create new services and products that serve constituents with the help of the private sector, while also stimulating job growth and other aspects of the economy.
APIs will not be all upside; each program and initiative will have its own policy problems and unintended consequences. One problem that plagues API initiatives is a lack of resources, in the form of money and skilled workers, to make sure efforts are successful. Without proper management, poorly executed APIs can open up huge security holes and introduce privacy concerns at a scale never imagined.
APIs need to be managed properly, with sensible real-time controls for keeping operations in check.
2. How could the Department ensure that access to title IV, HEA student aid programs truly remains free, even amidst the potential development of third-party apps that may charge a fee for assistance in participating in free government programs, products, and services with or without providing legitimate value-added services?
Partner Framework + Service Management = Quality of Service Across Platform
3. What other policy concerns should the Department consider with regard to the potential development of APIs for higher education data and student aid processes at the Department?
Not a policy or education expert, I will leave this to others to determine. Also something that should be built into API operations, and discovered on a program by program basis.
4. How would APIs best interact with other systems already in use in student aid processes (e.g., within States)?
The only way you will know is if you do it. The IRS e-file system is helping with this, but it isn’t even a perfect model to follow. We will never know the potential here until a platform is stood up, and resources are made available. All signs point to APIs opening up a huge amount of interoperability between not just states and the federal government, but also with cities and counties.
5. How would Department APIs benefit or burden institutions participating in title IV, HEA programs?
If APIs aren’t given the proper resources to operate it can introduce security, privacy and support concerns that would not have been there before. A properly run API initiative will provide support, while an underfunded, undermanned initiative will just further burden institutions.
6. While the Department continues to enhance and refine its own processes and products (e.g., through improvements to FAFSA or the IDR application process), how would third-party efforts using APIs complement or present challenges to these processes?
These two things should not be separate. The internal efforts should be seen as just another partner layer within the API ecosystem. All future service and products developed internally within the Department of Education should use the same API infrastructure developed for partners and the public.
If APIs are not used internally, API efforts will always fail. APIs are not just about providing access to external resources, it is about opening up the Department to think about its resources in an external way that benefits the public, partners as well as within the government.