User-Selected Storage


#1

I started this as a response to the notes from the meeting a month ago, but decided this deserved its own post.

I’d really love to get a clear view on what works now as far as configuring your own gaia hub. What I mean by this is, what can a power user do now and will it actually work end-to-end? And what can a regular user do now?

I’m not sure where to track progress on user-selected storage, as there are a variety of open issues tied to this.

Let’s start with power users, going from easiest process to hardest:

One-Click Heroku Deploy

A power user can pretty easily configure a Heroku-hosted hub with this method, but the documentation around next steps is sparse. Based on multiple forum posts and Slack posts, it seems that even if you do get this up and running, it won’t matter because apps can’t access an updated gaia hub URL. Someone please let me know if this is no longer the case.

Configuring a Different Hub (S3, Azure, Google Cloud)

The instructions for doing so are documented here: https://github.com/blockstack/gaia/blob/master/hub/README.md

However, as mentioned above, it seems that even a successful configuration (which is no small task) results in apps still pointing to the default gaia hub. See the issue “URL for Gaia Hub Connection isn't changed after save API Settings”.


Now, let’s move on to our non-power users. The ones that will actually be using the apps once devs get them out of alpha and beta. None of them will be doing the above, and presumably, all of them care about this promise:

In the meeting a month ago, it was agreed that users need to be able to select the storage provider of their choice.

These users are completely out of luck and do not own their data at all. This presents significant challenges for app adoption. For apps like Graphite and Misthos and Stealthy that have exited or are exiting the early stage of user traction and experimentation, the lack of user-controlled storage is a deal-breaker. I’ll give you an example.

Graphite is working on a significant enterprise deal with a big-name journalism shop. However, the main selling point (user-owned, private data) is not available. Using just Blockstack, it is possible this deal cannot be made.


I know this has been brought up a lot (by me mostly) and I know there is work being done. But it’s not clear how well this work is being prioritized or where the best place to track status is. I’d love it if this forum post could be the place where high-level updates end up, but if it all needs to be in GitHub, just point us to the right issues to watch.

Ideally, I’d like to know the time period in which users (non-power users) can select their own storage providers. I need to determine whether or not I have to build my own solution outside of Blockstack for this.

cc: @jude @aaron @muneeb @ryan


#2

Hi @jehunter5811,

First, thank you for putting all of these issues together in one place. I went ahead and opened up a bunch of issues related to both this post, other forum posts, and previous meeting notes, and organized them into a kanban here: https://github.com/orgs/blockstack/projects/27

I also wrote up some notes from the aforementioned engineering meeting that describe how we can do per-application Gaia hubs here. While a profile object currently encodes a user’s per-app Gaia URLs for reads, it does not yet do so for writes (which will be necessary to support per-app Gaia hubs). However, I think this could be unlocked for power users (i.e. CLI users) relatively quickly. Pending feedback from @larry, @aaron, and @yukan, I’m willing to take a crack at it.

Would the above proposal address your concerns? I apologize that the needle hasn’t moved very much in the past month—I’m currently tied up on the feature/token-v1 branch of Blockstack Core, for example. However, I think I’ll have some free cycles this weekend.

Best,
Jude


#3

@jude, not able to find https://github.com/orgs/blockstack/projects/27


#4

Hi Jude,

Thanks for the reply! I know everyone is working hard on Blockstack’s Milestone 1 goals, so I appreciate you reviewing this.

I think your proposed solution sounds great for power users, and I might be able to use that in working through enterprise deals. But I can imagine it still being a deal-breaker without having an option for non-power users to choose their own storage providers. Let me give you an example (without using real client prospects):

The Graphite Times is interested in using Graphite. They have a team of 15 journalists and eight editors. The Graphite Times currently stores all their data in a number of locations (based largely on what’s convenient to their reporters and sources). They want to ensure that the data can still be conveniently stored, but they want it to be verifiably somewhere each user controls, like Google Drive or Box or Dropbox, for example.

As the enterprise salesperson and founder, I might be able to visit each journalist and editor in person or through video conferencing and help them configure their own gaia hubs, but with 23 people needing help, that quickly becomes unmanageable. Then, these journalists will also want their sources to be able to easily use Graphite and to be able to select storage providers they feel safe using (maybe Azure is blocked in their country, for example). That would require the Graphite Times jumping on a call with me, giving me their source’s contact information, putting me in touch with their source, and then me helping that person configure their gaia hub.

That last example is never going to happen. I can’t imagine a reporter, who in some cases protects sources at the risk of their own personal freedom, just giving me the source’s contact info to help configure a gaia hub.

Hope that makes sense!


#5

Should be fixed now. Thanks!


#6

Just wanted to give an update this week on things I was able to work on:

  • I sent this PR to the Gaia repository. It adds the ability to list files, and paves the way to being able to migrate data between Gaia hubs.

  • I modified the CLI with new commands: get_app_keys, gaia_getfile, gaia_putfile, and gaia_listfiles. This will let you programmatically interact with your Gaia hub and, with some shell scripting, download all of your Gaia data. Note that gaia_listfiles only works with the Gaia hub PR above.

This week, I’m going to try to modify the Browser so it will honor my proposed hubUrls directive in your profile, which will allow you to specify app-specific Gaia hubs (I’ll add an option to the CLI that lets you manipulate them). Once my Gaia PR gets merged and once the Browser team is happy with the above, we can move forward on adding a GUI for adding/removing per-app Gaia hubs.
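
To make the per-app hub idea a bit more concrete, here is a minimal sketch of how a client might pick the hub to write to for a given app. Everything specific in it is an assumption for illustration: the shape of the hubUrls map (app origin to hub URL), the resolveWriteHub helper, the example.com URLs, and the default hub URL. The real directive may end up looking different.

```javascript
// Hypothetical profile fragment. `apps` stands for the per-app read URLs that
// profiles already carry; `hubUrls` is the proposed per-app write hub map.
// The key layout of `hubUrls` shown here is an assumption.
const profile = {
  apps: {
    "https://graphite.example.com": "https://gaia.example.com/hub/1ExampleAddress/",
  },
  hubUrls: {
    "https://graphite.example.com": "https://my-hub.example.com",
  },
};

// Assumed default hub, used when the profile has no per-app entry.
const DEFAULT_HUB_URL = "https://hub.blockstack.org";

// Return the hub an app should write to: the per-app entry if one exists,
// otherwise the default hub.
function resolveWriteHub(profile, appOrigin, defaultHubUrl = DEFAULT_HUB_URL) {
  const hubUrls = (profile && profile.hubUrls) || {};
  return Object.prototype.hasOwnProperty.call(hubUrls, appOrigin)
    ? hubUrls[appOrigin]
    : defaultHubUrl;
}

console.log(resolveWriteHub(profile, "https://graphite.example.com"));
console.log(resolveWriteHub(profile, "https://other.example.com"));
```

Falling back to the default hub keeps apps without an entry working exactly as they do today.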


#7

This is great! Thanks for the update, Jude.


#8

Hey @jude. I just pulled the CLI to get the updates, but I don’t see documentation of these new commands. Did you move the CLI repo?


#9

Re-surfacing this thread because I have an implementation plan to supplement the eventual release of easy user storage configuration. BUT I don’t want to build it if in, like, two weeks Blockstack is going to release an update that allows true user-selected storage.

Here’s my planned solution:

User is prompted to continue using Default storage OR pick from available storage providers (yet to be determined who those storage providers will be).

If the user chooses a custom storage provider, they will be walked through the process of connecting to that provider (OAuth or API key). In either case, two files will still be written to default storage:

  1. An indicator file that says whether or not the user is using default storage (this will be a public file) as well as the read path for that storage provider.
  2. A file with the necessary connection information (access token, refresh token, etc), encrypted with the user’s app public key.

This will allow for multi-player storage to still work very much like it works today, but will give the user more control and visibility into their actual data. They’d still need to rely on default storage for those pointer files, but that’s a minimal ask compared to storing all their actual data.

So, back to my original question in this post: How close is user-selected storage? Is it, in an easy-to-use implementation similar to the one I outlined here, something Blockstack expects to deliver in Q4? Or will anything delivered in Q4 still require quite a bit of technical knowledge from the end user?

Thanks!


#10

Easy user-selected storage is definitely on the roadmap for Q4, but not in the next two weeks.

This is an interesting idea. I think what you’re proposing is an “HTTP 302”-like protocol for Gaia: when the user authenticates to Gaia hub A as part of handlePendingSignIn(), Graphite immediately checks to see if there’s an “indicator file” in the bucket that points to Gaia hub B. If so, it transparently re-authenticates to Gaia hub B and sets the Gaia authentication token to point to Gaia hub B instead (using the same app private key). This assumes that Gaia hub B is already set up somewhere. Is my understanding correct?
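
Roughly, the redirect flow described here could be sketched as below. This is only an illustration under stated assumptions: the hubs are in-memory stand-ins, and the indicator file name, its hubUrl field, and the connectToHub / connectWithRedirect helpers are all hypothetical rather than part of blockstack.js or Gaia.

```javascript
// Hypothetical name for the "indicator file" kept in the hub A bucket.
const INDICATOR_FILE = "storage-indicator.json";

// In-memory stand-ins for two Gaia hubs; each maps file names to contents.
const hubs = {
  "https://hub-a.example.com": {
    [INDICATOR_FILE]: JSON.stringify({ hubUrl: "https://hub-b.example.com" }),
  },
  "https://hub-b.example.com": {},
};

// Stub for authenticating to a hub. A real client would obtain a Gaia
// authentication token here using the app private key.
function connectToHub(hubUrl, appPrivateKey) {
  const files = hubs[hubUrl];
  if (!files) throw new Error(`unknown hub: ${hubUrl}`);
  return {
    hubUrl,
    appPrivateKey,
    getFile: (name) => (name in files ? files[name] : null),
  };
}

// Connect to the default hub, then follow at most one redirect if an
// indicator file is present, reusing the same app private key.
function connectWithRedirect(defaultHubUrl, appPrivateKey) {
  const first = connectToHub(defaultHubUrl, appPrivateKey);
  const indicator = first.getFile(INDICATOR_FILE);
  if (indicator === null) return first; // no redirect: stay on hub A
  const { hubUrl } = JSON.parse(indicator);
  return connectToHub(hubUrl, appPrivateKey);
}

const session = connectWithRedirect("https://hub-a.example.com", "app-private-key");
console.log(session.hubUrl);
```

Following only a single hop is a deliberate simplification; it sidesteps redirect loops between hubs.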


#11

That’s not exactly what I’m talking about with my plan. I think that makes sense for a Blockstack plan, but for me, I’m planning to just use the existing Gaia storage for any given user to store a pointer file. So it might look like this:

{
  customStorage: true,
  provider: "S3"
}

That would then give me enough information to look up the second file stored in the user’s default Gaia which might look like:

{
  s3ReadPath: $somereadpathURLhere,
  s3WritePath: $somewritepathURLhere,
  s3clientId: $clientID,
  s3clientSecret: $clientSecret
}

That’s a very basic example, but it can be extended for really any storage provider I want. If I want to use Dropbox with OAuth, instead of the client_id and client_secret, I would store the accessToken and the refreshToken.

This would all be encrypted in the user’s default Gaia hub and would then be used to make the GET and POST requests to the read paths and write paths provided.
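
As a rough sketch of how the two files fit together, the function below takes the already-parsed pointer file and the already-decrypted connection file and works out where reads and writes should go. The field names follow the two examples above; the resolveStorage helper and the shape of its return value are assumptions for illustration, and fetching and decrypting the files from default Gaia is left out.

```javascript
// Decide where a user's data lives, given the parsed pointer file and the
// decrypted connection file (both read from the user's default Gaia hub).
// The returned object shape is hypothetical.
function resolveStorage(pointer, credentials) {
  // No pointer file, or customStorage is false: keep using default Gaia.
  if (!pointer || !pointer.customStorage) {
    return { provider: "gaia", readPath: null, writePath: null, auth: null };
  }
  if (pointer.provider === "S3") {
    if (!credentials) throw new Error("missing connection file for S3");
    return {
      provider: "S3",
      readPath: credentials.s3ReadPath,
      writePath: credentials.s3WritePath,
      auth: {
        clientId: credentials.s3clientId,
        clientSecret: credentials.s3clientSecret,
      },
    };
  }
  // Other providers (e.g. Dropbox with accessToken / refreshToken) would get
  // their own branch here.
  throw new Error(`unsupported provider: ${pointer.provider}`);
}

const target = resolveStorage(
  { customStorage: true, provider: "S3" },
  {
    s3ReadPath: "https://bucket.example.com/read/",
    s3WritePath: "https://bucket.example.com/write/",
    s3clientId: "example-client-id",
    s3clientSecret: "example-client-secret",
  }
);
console.log(target.provider, target.readPath);
```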


#12

I’ve actually tested this method with Dropbox and it seems to work well. The only problem will be figuring out how to make it work easily with multi-player storage. And perhaps initially the answer is that Default Storage through Gaia acts as a replication layer AND as the primary layer for sharing with others.


#13

Ah, I see. Yeah, making your strategy work in a multiplayer setting is going to be tough since there’s no built-in way for users to securely discover and read files from Dropbox and S3. It could be done if you went through the effort of making the URLs to these files discoverable in the “main” Gaia hub, but you’d need to modify both the read and the write paths to achieve this.

EDIT: not S3, but Google Drive, Microsoft OneDrive, etc. – services where the program can’t choose or construct the read URL.


#14

In toying with this further, I think multiplayer storage isn’t all that difficult. S3, Azure, Dropbox, Google Drive, etc. all offer public read paths. I should be able to use those the same way I would with Gaia now.

Again, this is still using Gaia’s default storage to point to all these paths, so it doesn’t solve the problem of “what happens if Azure goes down”, but it does give people true visibility into and ownership of their files, which is a big first step.


#15

Last I checked, Dropbox and Google Drive didn’t have public read paths. Has this changed? If so, we should close this issue.


#16

It’s not as simple as just saying there’s a read path, but with a little bit of scripting, public read paths are available. The encrypted file in Dropbox or Google Drive just needs a public share link created. Here’s the Dropbox function:

https://dropbox.github.io/dropbox-sdk-js/global.html#SharingCreateSharedLinkArg

That will return the existing link or create a new one. So, storing that link in a well-known location would allow for multiplayer reads from Dropbox.


#17

Sorry, bad word choice on my part (coffee hasn’t kicked in yet).

What I meant to ask was whether or not Dropbox lets you have programmatically-controlled public URLs – like https://dropbox.com/myapp/path/to/file.txt. If Dropbox chooses the URL instead of your client, then the client doing the writes needs to maintain an index that maps predictable path names to their obfuscated URLs and share this index with readers. It’s building and maintaining this index that can get problematic (and is what the linked Gaia issue is about). That’s what I was getting at when I was saying “there’s no built-in way for users to securely discover and read files from Dropbox” – you have to build and maintain the index yourself.
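
To illustrate the kind of index I mean, here is a minimal sketch. The createShareLink stub stands in for a provider that picks its own URLs, and publishPath / resolvePath are hypothetical helper names. A real writer would call the provider’s share-link API and store the serialized index somewhere readers can find it.

```javascript
// Stub for a provider that chooses its own, unpredictable share URLs.
// A real writer would call the provider's share-link API here instead.
let linkCounter = 0;
function createShareLink(path) {
  linkCounter += 1;
  return `https://files.example.com/s/opaque${linkCounter}`;
}

// Writer side: record the share link for a predictable path, creating the
// link only the first time the path is published.
function publishPath(index, path, makeLink = createShareLink) {
  if (!Object.prototype.hasOwnProperty.call(index, path)) {
    index[path] = makeLink(path);
  }
  return index[path];
}

// Reader side: translate a predictable path into the URL to fetch,
// or null if the writer never published that path.
function resolvePath(index, path) {
  return Object.prototype.hasOwnProperty.call(index, path) ? index[path] : null;
}

// The writer builds the index and serializes it for a well-known location.
const writerIndex = {};
publishPath(writerIndex, "documents/notes.json");
const serializedIndex = JSON.stringify(writerIndex);

// A reader loads the shared index and looks up the obfuscated URL.
const readerIndex = JSON.parse(serializedIndex);
console.log(resolvePath(readerIndex, "documents/notes.json"));
```

The hard part is not the lookup itself but keeping this index consistent across writes and getting it to readers securely.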


#18

Ah yes, that’s correct. I’m not looking at this as a Blockstack solution, though it could be rigged to work for Blockstack too. In the Graphite solution, I would already have those share links indexed in Blockstack default storage. So it’s an extra API call, but to get people storing data in their chosen storage provider, I think it’s worth it.


#19

True that. Also, in your case, I think this is useful regardless of whether or not users change their Gaia hubs or use a Gaia hub driver with multiple backends (especially if Graphite users want to integrate custom storage solutions).