Evernote Tech Blog

The Care and Feeding of Elephants

Inside Evernote: Dean Rzonca

What is your role on the Evernote Business team?

As a developer, I do everything – working with the product manager and designers to refine requirements for new features, helping QA form test plans, and of course lots of coding. I do full-stack development, everything from CSS animations to database schema changes.

Why is Evernote Business an exciting product to work on?

Evernote has been around for a while, but Business is a relatively new thing for us, so it’s exciting to be a part of something this early on. Even now, a lot of really awesome companies are using Evernote Business, so there’s a lot of impact.

 

Can you describe a few of the Evernote features you are working on now and/or you have helped build in the past?

When I started at Evernote there were only a handful of web developers, so I’ve done work in almost every part of the web service.

One of my first projects was to help get us ready to launch our service in China, which was a huge effort for almost everybody in the company. Later, I worked on Reminders in the web client, which was a lot of fun. We have really great designers here, and I enjoyed working with them to get the drag & drop interface just right.

I also did a lot of the initial work for Business, before we had a dedicated team for it. I worked with a couple of other developers on pages for administration and signup, and added support for viewing business notebooks side-by-side with personal notebooks in our web client.

What I’m working on right now, however, is top-secret.

 

What is your biggest challenge at present?

I think we have two major challenges right now.

First, our scaling model has done a great job handling millions of users who each have their own collection of notes, but there’s a lot of work to do for business users who may have access to thousands of notebooks and hundreds of thousands of notes. We need to make sure everything is fast and helps you find what you need.

Second, companies are concerned about knowing where and how information in their business can be accessed, but we also respect users’ privacy. We don’t make users keep track of two separate logins if they use Evernote at work as well as on their own, because that’s the best experience, but it’s definitely been more challenging technically. At the beginning, most Business users were already using Evernote on their own, so we tailored Business to that. But as we grow, we need to think more and more about people who are getting introduced to Evernote because they work at a company that uses it.

 

What is the most satisfying part of your job?

I get to work on a product that helps millions of people. All the time, when people hear that I work at Evernote they tell me how much they love our product and what a difference it’s made for them. We’re really focused on getting features done quickly, so almost every week there’s something new in production that I worked on.

 

What is your background?

I went to RIT and I have a degree in Software Engineering, which is basically Applied CS. We focused a lot on software architecture and process, and had some really great professors that were actively working in the industry while they were teaching. While I was there, I focused on real-time and embedded systems, but decided it wasn’t really what I wanted to be doing.

I’ve been doing full-stack web development, along with some iOS on the side.

What’s your favorite Evernote feature?

I love Skitch. It’s so much easier than trying to explain things with words. Also OCR and Document Camera. Being able to haphazardly add pictures and always find them later is sort of amazing.

 

How do you use Evernote?

We do all of our work in Evernote Business, so we have notebooks for planning and design handoff that are shared across teams. I also use it to keep a reading list of clipped articles, and plan trips with shared notebooks. I’ve been getting really into home brewing, so I keep a notebook with recipes for everything I’ve brewed, along with notes about how everything turns out. If the homebrew supply store writes down a recipe for me, all I need to do is take a photo of it.

 

If you could only use 3 adjectives to describe Evernote’s culture what would they be?

Fast, challenging, fun.

 

What is the best part about working for Evernote?

I’m surrounded by really smart, motivated people.

 

Why did you choose to work at Evernote?

I’ve been an Evernote user since it first launched, so I jumped at the chance to come work here.

Using DMARC to Fight Email Spam

Since January of this year, we’ve observed spammers launching campaigns using our name. Early versions included links to pharmaceutical sites, but later versions included malicious attachments.

The spammers started by addressing these emails from legitimate, non-Evernote email addresses, but were using a visible name that said “Evernote Service” like the email below:

[Screenshot: spam email showing the display name “Evernote Service”]

It didn’t take long for the spammers to change their methods and start impersonating legitimate @evernote.com addresses in the “From” field. If you were on the spammer’s list of email targets, you started receiving emails that looked like they came from one of our email addresses. We didn’t send them, but there was no obvious way for you to know.

We want you to be confident that emails from Evernote really come from us. We have taken a concrete step toward ensuring this by publishing an enforcing DMARC policy. Any email sent using an @evernote.com sender address must be cryptographically signed using DKIM and originate from an IP address we publish in our SPF record.

Not all email providers support DMARC, but many large mail service providers do. When they receive an email that tries to impersonate us, they will block it before it hits your inbox.

What is DMARC (and DKIM and SPF)?

DMARC is an email delivery policy that domain owners can publish to instruct mail servers how to handle email security violations for their domain. The action can be “none”, “quarantine”, or “reject”, and you can set a sampling percentage so that you can ramp up your policy gradually.

To pass a DMARC policy check, an email needs to be authenticated by DKIM or SPF using a domain that aligns with the address in the “From” header. DKIM uses public key cryptography to sign the email message, which allows the receiving server to verify it. The sending mail server signs using a private key and adds that signature as a header. The receiving mail server retrieves the public key from a DNS record and verifies the signature. The receiving mail server also verifies that the email originates from an IP address listed in that domain’s Sender Policy Framework (SPF) DNS record. The DMARC specification treats a message as passing when at least one of these checks succeeds with an aligned domain; if neither does, the DMARC check fails and the receiving server takes the action you specified in your DMARC policy.

The road to a reject policy

We rely on a lot of service providers for business functions like customer service tickets, recruiting, marketing, discussion forums, and corporate email. Tracking all of these down and getting them compliant with DMARC took a significant amount of effort. In some cases, we were unable to get them compliant and had to change our approach and turn off impersonation or route email through a secondary service provider that would DKIM sign on our behalf.

This isn’t meant to be a detailed HOWTO, but the main steps you should follow are:

  1.  Set up your DMARC report email accounts (rua and ruf)
  2.  Publish a DMARC record with a policy of “none”
  3.  Test each of your service providers for DKIM signing
  4.  Verify that each of your service providers is listed in your SPF record
  5.  Review the RUA reports to identify service providers you may have missed
  6.  Update your DMARC policy to “quarantine” with a low percentage
  7.  Slowly increase your percentage to 100%
  8.  Change your DMARC policy to “reject”
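
As a rough illustration of the ramp-up in steps 2 and 6 through 8, the sketch below generates the TXT value you might publish at each stage. The percentages in the schedule and the `dmarc@example.com` report address are assumptions chosen for the example, not the values we used.

```python
def dmarc_record(policy, pct=100, rua="dmarc@example.com"):
    """Build the TXT value for a _dmarc.<domain> record."""
    if policy not in ("none", "quarantine", "reject"):
        raise ValueError(f"unknown DMARC policy: {policy}")
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    return f"v=DMARC1; p={policy}; pct={pct}; rua=mailto:{rua}"


# Hypothetical schedule: monitor first, quarantine a growing sample, then reject.
ROLLOUT = [
    ("none", 100),
    ("quarantine", 5),
    ("quarantine", 25),
    ("quarantine", 100),
    ("reject", 100),
]


def rollout_records(rua="dmarc@example.com"):
    """Return one TXT value per rollout stage, in publishing order."""
    return [dmarc_record(policy, pct, rua) for policy, pct in ROLLOUT]


if __name__ == "__main__":
    for record in rollout_records():
        print(record)
```

How long to stay at each stage depends on what the aggregate (rua) reports show; the schedule here only fixes the order.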

The result for us is a DMARC record that looks like the following:

$ dig txt +short _dmarc.evernote.com
"v=DMARC1\; p=reject\; pct=100\; rua=mailto:dmarc@evernote.com\; ruf=mailto:dmarc-ruf@evernote.com\; fo=0:s"
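
If you want to inspect a record like this in a script, a minimal sketch of splitting it into tags could look like the following. The `parse_dmarc` helper is a name made up for this example, and it only handles the simple `tag=value` layout shown above.

```python
def parse_dmarc(txt):
    """Split a DMARC TXT value into a dict of tag -> value."""
    # dig wraps the value in quotes and escapes semicolons as "\;"; undo both.
    cleaned = txt.strip().strip('"').replace("\\;", ";")
    tags = {}
    for part in cleaned.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        tags[name.strip()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    return tags


record = (
    r'"v=DMARC1\; p=reject\; pct=100\; rua=mailto:dmarc@evernote.com\; '
    r'ruf=mailto:dmarc-ruf@evernote.com\; fo=0:s"'
)
tags = parse_dmarc(record)
print(tags["p"], tags["pct"], tags["rua"])
```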

Forwarding breaks deliverability

As a part of this process, we learned that forwarding can break DKIM and SPF, and that not all mail service providers forward email in a way that supports DMARC. Let’s start with DKIM and canonicalization.

We were originally signing our service emails with a DKIM canonicalization of “simple/simple”. It turns out that “simple” doesn’t mean flexible: some email services would add whitespace or line breaks that would cause the signature check to fail. A mail service provider clued us in to this and we switched the canonicalization to “relaxed/relaxed”. That resolved many of the failures we were seeing due to failed message body hashing.
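
To make the difference concrete, here is a small sketch of the two body canonicalization algorithms as RFC 6376 describes them, followed by a SHA-256 body hash. The message text and the trailing space "added by a relay" are invented for the example, and header canonicalization is left out.

```python
import base64
import hashlib
import re


def canon_body_simple(body: bytes) -> bytes:
    """'simple': ignore empty lines at the end of the body, nothing else."""
    while body.endswith(b"\r\n\r\n"):
        body = body[:-2]
    if not body.endswith(b"\r\n"):
        body += b"\r\n"
    return body


def canon_body_relaxed(body: bytes) -> bytes:
    """'relaxed': also collapse runs of whitespace and drop it at line ends."""
    lines = body.split(b"\r\n")
    lines = [re.sub(rb"[ \t]+", b" ", line).rstrip(b" ") for line in lines]
    while lines and lines[-1] == b"":
        lines.pop()
    return b"".join(line + b"\r\n" for line in lines)


def body_hash(canonical: bytes) -> str:
    """Base64 SHA-256 digest, the form carried in the bh= tag."""
    return base64.b64encode(hashlib.sha256(canonical).digest()).decode("ascii")


original = b"Your note was saved.\r\n"
relayed = b"Your note was saved. \r\n"  # a relay appended one trailing space

simple_ok = body_hash(canon_body_simple(original)) == body_hash(canon_body_simple(relayed))
relaxed_ok = body_hash(canon_body_relaxed(original)) == body_hash(canon_body_relaxed(relayed))
print("simple survives:", simple_ok)
print("relaxed survives:", relaxed_ok)
```

Relaxed canonicalization only tolerates whitespace changes. A relay that rewraps or rewrites the body text will still invalidate the signature under either setting.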

Forwarding also breaks SPF. Take the example of registering your Evernote account with a university email account (.edu). You want email to keep arriving there, but you forward it to a different account. If your email provider adheres strictly to RFC 5321, they won’t rewrite the “MAIL FROM” address. Instead, they preserve the return-path as they forward the message along. The destination mail server sees that the return-path is an @evernote.com address, but the connecting IP address belongs to the .edu, which isn’t in our SPF record. The destination mail server rejects the message.

To address this issue, a number of email service providers have adopted the Sender Rewriting Scheme (SRS). They rewrite the return-path to their own domain so that the SPF check passes, which results in better email deliverability. A significant number of services don’t support this, and forwarded emails from our service get rejected. If you are an email service provider and do not support SRS, you should strongly consider implementing it.
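
The forwarding scenario can be modeled in a few lines. This is a toy sketch: the domains, the address ranges (taken from the documentation-only blocks), and the `SPF` table are all invented, and the rewrite is a simplified stand-in for SRS, which in practice also embeds a hash and a timestamp in the rewritten address.

```python
import ipaddress

# Hypothetical SPF data: domain -> networks allowed to send mail for it.
SPF = {
    "evernote.example": ["192.0.2.0/24"],
    "university.example": ["198.51.100.0/24"],
}


def spf_pass(mail_from: str, client_ip: str) -> bool:
    """Check the connecting IP against the SPF networks of the MAIL FROM domain."""
    domain = mail_from.rsplit("@", 1)[1]
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(net) for net in SPF.get(domain, []))


def srs_rewrite(mail_from: str, forwarder_domain: str) -> str:
    """Simplified SRS-style rewrite (no hash or timestamp, unlike real SRS)."""
    local, domain = mail_from.rsplit("@", 1)
    return f"SRS0={domain}={local}@{forwarder_domain}"


sender = "no-reply@evernote.example"
forwarder_ip = "198.51.100.25"  # the university's outbound mail server

direct = spf_pass(sender, "192.0.2.10")
forwarded = spf_pass(sender, forwarder_ip)
rewritten = srs_rewrite(sender, "university.example")
forwarded_with_srs = spf_pass(rewritten, forwarder_ip)

print("direct delivery passes SPF:", direct)
print("plain forward passes SPF:", forwarded)
print("forward with rewritten return-path passes SPF:", forwarded_with_srs)
```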

Import Your Fotopedia Notes to Evernote

by Tom Charles, App Reviews @ Evernote

 

On Sunday, August 10, all user-uploaded photos will be erased from the servers of photo encyclopedia Fotopedia as the company ceases operations. In order to save your cherished memories, we’ve built a tool to easily transfer all your files from Fotopedia to Evernote.

To import your photos, head to fotopediatoevernote.com and follow the three steps listed. In doing so, a new “Fotopedia” notebook will be created, complete with an individual note for each photo in your Fotopedia account.

It’s always sad to see a valued company shut its doors, but we hope this tool can help mitigate the effects.

Evernote Strengthens Privacy Position with New Security Capabilities

We believe your data is yours and should be protected. As part of that commitment, we’ve added two new encryption capabilities that improve the security of your data when it travels across our network and the Internet. We’ve launched inter-data center encryption, which means we are encrypting the network links that connect our US data centers, and we now support STARTTLS for secure mail delivery to your Evernote account.

Inter-Data Center Encryption
We operate two data centers in the US and transmit data between them using a dedicated network link that isn’t connected to the Internet. Because we don’t own or operate that link, we decided to take extra steps to prevent unauthorized access to data – including note content – transmitted between data centers on this network connection. As a result, in April 2014 we enabled AES encryption for all traffic flowing between our US data centers.

Email encryption in transit (STARTTLS)
We give all Evernote users a way to create notes in their account by sending emails to a unique Evernote email address. Prior to enabling STARTTLS, emails you sent to our service were transmitted unencrypted across the Internet. With STARTTLS enabled, they are encrypted in transit if the sending service supports TLS. For example, all mail sent from gmail.com and yahoo.com accounts will now be encrypted. We also support TLS for outbound emails, which means that emails you receive from our service, such as password resets, are also encrypted in transit if your mail service provider supports TLS.

These new security capabilities complement our existing HTTPS and HTTP Strict Transport Security (HSTS) support to protect your data in transit from unwanted interception. We plan to continue improving our transport security posture to support our commitment to protecting your data.

Inside Evernote: Kevin Fahy

What is your role on the Web team?

I am a developer. At Evernote, this means that you’re either writing code, bouncing ideas off of fellow developers, writing ideas on the walls, discussing requirements with PMs and designers, or figuring out what you broke with QA’s help. But most of the time you are writing code with minimal distractions. The few meetings we have are always useful to us. In our weekly sprint planning meeting, for example, our product manager makes sure that you’re taking on interesting projects that you genuinely want to work on – you never feel like just another resource here.

 What product(s) do you work on?

On the web team you often write code that users can find in multiple products. For example, if a user wants to upgrade to premium, then some of our apps display a webview that the web team implements and styles. And of course, all of our apps communicate with our cloud API, ultimately calling backend code that we develop and maintain. You can be as full stack as you like in the web team, and you get the chance to write code that’s used in many products. Personally, I spend most of my time developing our web application, where we implement new features, optimize the caching layer, and style visual components, among other things.

 Can you describe a few of the Evernote features you are working on now and/or you have helped build in the past?

 The cool thing about Evernote is that you get to code a lot of new features. In the web team, we push a new release every week, which usually contains new features. I’ve had the good fortune of being a part of some interesting ones. I implemented reminders in our web app in a team of three – we implemented everything in two fun weeks. While not a feature per se, I’ve been refactoring our web app’s caching layer recently – it’s always satisfying to push a big commit and then observe that the system runs a bit more efficiently, or at least no worse than before! Even my first project at Evernote turned out well: the web app’s image gallery (though a lot of credit goes to our designers – they make extremely well-designed mockups and are very easy to work with).

 What is your biggest challenge at present?

Developing at Evernote is very fast-paced – you try to write good code for new features as fast as you can, while also fixing the most important bugs in your backlog and refactoring older code. We have a lot of freedom in what we choose to code every day, and so prioritizing properly is very difficult.

 What is the most satisfying part of your job?

 Knowing that, every week, you are changing something about how a hundred million people use Evernote is exciting, even a bit scary sometimes. But the most satisfying part is that the code you push is ultimately yours – seeing your decisions and design choices live on production gives you a weekly shot of pride.

 What is your background?

 Evernote is my first full-time job out of school. I’ve been here for almost two years now, but even after my first few months I had felt like I learned a year’s worth of skills. Prior to Evernote, I had spent too much time in school pursuing an undergraduate computing science degree, but I had the opportunity to try out product management and development as an intern at a couple of great companies.

 Who has been your biggest mentor?

 There are many people at Evernote whom I look up to and have learned from, but it’s hard to label one person as a mentor. This is because of the flat structure at Evernote – as a developer, nobody stands over you and tells you what to do; instead, it is easy for you to simply turn around and ask somebody for advice. In the same way you take ownership over your projects at Evernote, you take ownership over your own learning. From interns all the way up to our CTO, I’ve learned something from everyone.

 What’s your favorite Evernote feature?

 OCR (Optical Character Recognition) is a killer feature for me. If I take a picture or scan a document, I can trust that it’ll always be searchable by its textual contents. This, combined with the web clipper, makes it easy for me to save everything that is important to me, no matter whether it’s physical or digital.

 How do you use Evernote?

 One of the coolest things about our product is that we support many different use cases. Even internally, we don’t all use Evernote the same. Personally, I cram as much stuff as I can into my account, e.g. pictures of receipts, scanned documents, random incoherent notes, more coherent but private journal entries, custom memes (created with Skitch!). I have many notebooks and tags that I apply to “important” notes, but a fair number of my notes get added to my default, unsorted “Heap” notebook. Even though I have a grand plan to someday organize all the notes in this notebook, Evernote’s search and “related notes” features make it easy to find these unorganized notes.

 If you could only use 3 adjectives to describe Evernote’s culture what would they be?

 Inspirational – you can listen to our CEO’s vision in weekly all-hands meetings, see beautiful mockups that our designers create, review elegant code that solves hard problems – there are many sources of inspiration at Evernote.

 Fast – without good documentation, you start forgetting about how your own code works pretty quickly because you write so much of it! New features, a backlog of bugs, TODOs and FIXMEs you want to get around to… there’s always something that you want to code.

 Empowering – you are always just a few keystrokes away from changing the way a hundred million people use Evernote.

 What is the best part about working for Evernote?

 As a developer, it feels like a startup because your team is small, you make many decisions behind the code that gets pushed to production, and you naturally take a lot of ownership over your work. But at the same time, we have a CEO with an awesome vision, and the resources to quickly make it a reality. We have fantastic PMs and designers who put new features in a really good state before we start talking about them as developers. We have a heroic ops team that makes it so that developers hardly ever worry about anything besides code. And we have a thorough QA team that will always find your bugs. The best part for me is having all of these resources at my disposal, while still feeling that I can hack away at interesting problems with a bunch of friends.

 Why did you choose to work at Evernote?

 I was first attracted by Evernote’s mission and business model – there’s a level of honesty about the way Evernote treats users’ data, and they are transparent about how they collect money from users. The company was also a good size for me – after experiences with larger companies in internships, I had really wanted to try working at a startup.

Stages of Denial

At 2:33pm on Tuesday afternoon (Pacific Time), an attacker began a Distributed Denial of Service (DDoS) against Evernote’s servers. At normal times, Evernote receives around 0.4 Gbps of incoming traffic and transmits out around 1.2 Gbps. We have a diverse set of pipes to the Internet that can handle several times that volume. During this attack, we experienced over 35 Gbps of incoming traffic from a network of thousands of hosts/bots. This quantity of bogus traffic exceeded the aggregate capacity of our network links, which crowded out most legitimate users.

By 3:25pm, our Operations team was able to diagnose the problem and enable a DDoS mitigation service that we had previously established with CenturyLink, one of our network providers. This required moving traffic away from our other feeds to CenturyLink via BGP and then enabling mitigation for our main IP address. Their filters were able to remove virtually all of the bogus packets and permit normal user traffic to flow again.

A few minutes later, the attackers shifted the nature of the attack to send different types of network payloads and to target other addresses that happened to share common infrastructure. This resulted in a couple hours of back-and-forth between our network engineers and CenturyLink to adapt to each attack while minimizing the impact on legitimate users (and our own incident response).

As our network links recovered, we received an unusually high volume of pent-up sync activity. This traffic was about 80% higher than we would have seen during a comparable time on another day. The extended outage had caused many of our servers to expire various cached records, so this initial stampede of client traffic led to a temporary spike in query volume on our central accounts database that was unsustainable.

This required us to perform a rolling service restart to allow individual shards time to handle their users’ pent-up synchronizations and repopulate their caches without overloading our accounts database. (This problem was not directly related to the DDoS; it was an unanticipated side effect of having an hour-long outage followed by an immediate resumption of full service without tuning our database and Couchbase cluster for that scenario.)

The ongoing network attacks and service after-effects persisted for nearly three more hours until the last components were restored to full functionality at 6:14pm.

CenturyLink’s DDoS mitigation service was able to scrub out invalid traffic to restore access, but it took a while for us to enable and configure the solution. This process was a bit haphazard because we had not yet completed our deployment configuration and testing before the incident began. Our networking team had only recently contracted for this service, and they were carefully working through a deployment plan to ensure smooth operation during a future incident. All of the procedures and runbooks were still being drafted, so we hadn’t yet determined exactly which rules would need to be applied to block an attack while permitting all legitimate traffic.

Our final days of testing and configuration were compressed down to a few hours, so the initial DDoS mitigation heuristics were not tuned for our particular application characteristics. The untuned filters scrubbed out virtually all of the bogus traffic, but they also produced a moderate level of “false positives,” which blocked some legitimate users (and partner services like Livescribe) from connecting to Evernote.

Over the following day, we saw another wave of network attacks, which were fully mitigated. Our network engineers worked with CenturyLink to incrementally refine our filtering heuristics to reduce the number of legitimate users that were blocked. As of 4pm Wednesday, we felt that we had addressed virtually all of the incorrect blockages to restore service to the remainder of our customers.

Post-Mortem

Overall, our Operations crew handled their DDoS trial-by-fire extremely well, but we have work ahead to minimize the disruption to our users in future incidents.

The network engineers get to complete the DDoS procedures, configurations, runbooks, automation, etc. so that they can trigger the full set of mitigations in minutes rather than hours. The systems group has a set of improvements planned to make the service handle “recovery stampedes” after extended outages more gracefully. And our client teams have a couple of tickets to reduce those stampedes in the first place.

Ultimately, we know that every minute of outage for the Evernote service may prevent important tasks for thousands of our users, so we will make every effort to reduce or eliminate the impact of such attacks in the future.

Securing Impala for analysts

We’ve previously described the Hadoop/Hive data warehouse we built in 2012 to store and process the HTTP access logs (450M records/day) and structured application event logs (170M events/day) that are generated by our service.

This setup is still working well for us, but we added Impala into our cluster last year to speed up ad hoc analytic queries. This led to the promised 4x reduction in query times, but access to data in the cluster was basically “all or nothing” … anyone who could make queries against the cluster would have visibility into every table, row, and column within the environment.

Our engineering team works hard to make sure that our logs don’t contain sensitive data or personally identifying information, but we always want to operate under the principle of least privilege for all access into our production systems and data. (E.g. Phil Libin has no logins to our admin/support tools and no permission to crawl our HTTP access logs.) This principle means that we had to restrict Impala query privileges to a very small handful of staff who absolutely needed to go back to the primary data sources.

Recently, we spent some time trying to figure out how we could give a slightly wider group of analysts the ability to access a subset of the data stored within Impala. A few constraints and goals:

  1. The analysts access our reporting environment through a VPN that performs strong, two-factor authentication and then restricts access to a minimal whitelist of IP:port endpoints in the environment. We’d like to enable Impala queries via the smallest possible expansion of that ACL (ideally, one new TCP port on one host).
  2. The analysts do not currently have or need any shell accounts on the Debian servers that run our Hadoop cluster, and we’d really prefer not to create Linux logins for them just to permit Impala queries.
  3. They should be able to perform Impala queries using their existing desktop SQL clients. (We use Razor due to its JDBC support.)

We flailed around for a couple of weeks trying to figure out some way to do this before stumbling across a solution using a mix of Hive/Impala views, SASL authentication to a local DB file, and user/group/role definitions via a Sentry policy file.

Hive/Impala views

Many databases rely on views to provide variable levels of access to data stored within tables. Access to a full table may be restricted, but you can create views giving access to a subset of the rows and/or columns in that table, and permit a different set of consumers to access those views.

Here’s an example table in Hive that contains a hypothetical sequence of events. Each event has an IP address, country code, client identifier, and an “action” that was performed by that client. The full ‘events’ table is in the database named ‘sensitive’, and we create two views in the ‘filtered’ database to give restricted access onto that table:

$ cat /tmp/events.csv
10.1.2.3,US,android,createNote
10.200.88.99,FR,windows,updateNote
10.1.2.3,US,android,updateNote
10.200.88.77,FR,ios,createNote
10.1.4.5,US,windows,updateTag

$ hive -S
hive> create database sensitive;
hive> create table sensitive.events (
    ip STRING, country STRING, client STRING, action STRING
  ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
hive> load data local inpath '/tmp/events.csv' overwrite into table sensitive.events;
hive> create database filtered;
hive> create view filtered.events as select country, client, action from sensitive.events;
hive> create view filtered.events_usonly as
  select * from filtered.events where country = 'US';

The ‘filtered.events’ view gives access to all rows, but removes access to the IP address column. The ‘filtered.events_usonly’ view further restricts access to only rows that have a country of ‘US’.
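
If you don’t have a Hive cluster handy, the shape of these two views can be reproduced with Python’s built-in sqlite3 module. This is only a sketch of the row and column filtering: SQLite has no databases named ‘sensitive’ or ‘filtered’ and no per-user privileges, so the view names below are flattened stand-ins and access control is not modeled.

```python
import sqlite3

ROWS = [
    ("10.1.2.3", "US", "android", "createNote"),
    ("10.200.88.99", "FR", "windows", "updateNote"),
    ("10.1.2.3", "US", "android", "updateNote"),
    ("10.200.88.77", "FR", "ios", "createNote"),
    ("10.1.4.5", "US", "windows", "updateTag"),
]


def build_db():
    """Create the example table and the two views in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (ip TEXT, country TEXT, client TEXT, action TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", ROWS)
    # Column filter: everything except the IP address.
    conn.execute("CREATE VIEW filtered_events AS SELECT country, client, action FROM events")
    # Row filter layered on top of the column filter.
    conn.execute(
        "CREATE VIEW filtered_events_usonly AS "
        "SELECT * FROM filtered_events WHERE country = 'US'"
    )
    return conn


def columns(conn, relation):
    """Return the column names a SELECT * on the relation exposes."""
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [col[0] for col in cursor.description]


conn = build_db()
print(columns(conn, "filtered_events"))
print(conn.execute("SELECT * FROM filtered_events_usonly").fetchall())
```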

These views work great, but now we need to tell Impala to restrict groups of people to only access the correct views.

SASL username/password database

Impala’s daemon officially supports two mechanisms for authentication: Kerberos and LDAP. We don’t particularly want to set up a Kerberos infrastructure, and it’s not clear how that would work for people who are just connecting to the Impala daemon over TCP from a SQL tool on their laptops.

We do have LDAP, but the current support in Impala is extremely preliminary (“only tested against Active Directory”) and we couldn’t get it to work against our TLS-only OpenLDAP infrastructure with the limited set of configuration options available today.

While flailing around with LDAP, we decided to try the undocumented --ldap_manual_config option. It turns out that if you tell Impala that it should perform authentication with LDAP (--enable_ldap_auth) using this “manual configuration” option, that means “don’t use LDAP at all, just try to match the client’s username+password against a BerkeleyDB file sitting at /etc/sasldb2.”

We created that file using the ‘saslpasswd2’ command to enter each username and password on our desired impala-server host. As an example, the following shows three different accounts being created in the sasldb2 file:

# saslpasswd2 sysadmin1
Password:
Again (for verification):
# saslpasswd2 analyst1
...
# saslpasswd2 analyst2
...

# ls -al /etc/sasldb2
-rw-r----- 1 root sasl 12288 Jun  4 17:26 /etc/sasldb2
# usermod -a -G sasl impala

(These usernames do not correspond with any shell accounts in /etc/passwd … they are a standalone authentication database.)

Sentry policy file

To specify the set of permissions for various groups of users, we need to tell Impala to use a Sentry policy file in HDFS. This file contains sections for mapping users into groups and groups onto roles. Roles specify which operations can be performed against which objects in Impala. Here we show our three example SASL users mapped into groups that can either perform any Impala query, perform SELECT operations against any of our ‘filtered’ views, or only SELECT from the ‘events_usonly’ view:

$ cat /tmp/impala-policy.ini
[groups]
sysadmins = any_operation
global_analysts = select_filtered
us_analysts = select_us
[roles]
any_operation = server=testimpala->db=*->table=*->action=*
select_filtered = server=testimpala->db=filtered->table=*->action=SELECT
select_us = server=testimpala->db=filtered->table=events_usonly->action=SELECT
[users]
sysadmin1 = sysadmins
analyst1 = global_analysts
analyst2 = us_analysts

$ hdfs dfs -put /tmp/impala-policy.ini /user/hive/warehouse/
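
Before uploading a policy file, it can help to sanity-check who ends up with which privileges. The sketch below reads the same example policy with Python’s configparser and resolves users to groups to roles. The matching in `allowed` is a deliberately simplified approximation written for this example, not Sentry’s actual evaluation logic.

```python
import configparser

POLICY = """\
[groups]
sysadmins = any_operation
global_analysts = select_filtered
us_analysts = select_us
[roles]
any_operation = server=testimpala->db=*->table=*->action=*
select_filtered = server=testimpala->db=filtered->table=*->action=SELECT
select_us = server=testimpala->db=filtered->table=events_usonly->action=SELECT
[users]
sysadmin1 = sysadmins
analyst1 = global_analysts
analyst2 = us_analysts
"""


def load_policy(text):
    """Parse the policy text; only '=' separates a name from its value."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep user, group, and role names case-sensitive
    parser.read_string(text)
    return parser


def privileges_for(parser, user):
    """Follow user -> groups -> roles and return each privilege as a dict."""
    if user not in parser["users"]:
        return []
    privileges = []
    for group in parser["users"][user].split(","):
        for role in parser["groups"][group.strip()].split(","):
            for rule in parser["roles"][role.strip()].split(","):
                parts = rule.strip().split("->")
                privileges.append(dict(p.split("=", 1) for p in parts))
    return privileges


def allowed(parser, user, db, table, action):
    """Simplified check: each scope must match exactly or be a '*' wildcard."""
    request = {"db": db, "table": table, "action": action}
    return any(
        all(priv.get(key, "*") in ("*", value) for key, value in request.items())
        for priv in privileges_for(parser, user)
    )


policy = load_policy(POLICY)
for user in ("sysadmin1", "analyst1", "analyst2"):
    print(user, allowed(policy, user, "filtered", "events", "SELECT"))
```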

Impala server arguments

Finally, we need to tell Impala’s daemon to use the SASL database for authentication and the Sentry policy file for authorization by adding the following arguments to IMPALA_SERVER_ARGS in /etc/default/impala:

-server_name=testimpala \
-authorization_policy_provider_class=org.apache.sentry.provider.file.LocalGroupResourceAuthorizationProvider \
-authorization_policy_file=/user/hive/warehouse/impala-policy.ini \
--enable_ldap_auth=true \
--ldap_manual_config=true \

Then restart the Impala daemons on that host and confirm that there are no errors in /var/log/impala/*

Testing …

Using the impala-shell command-line query tool, we can now confirm that the ‘sysadmin1’ user can query the sensitive source table:

$ impala-shell --quiet -l -u sysadmin1
LDAP password for sysadmin1:
[debian-virtualbox.rwc.etonreve.com:21000] > select * from sensitive.events;
+--------------+---------+---------+------------+
| ip           | country | client  | action     |
+--------------+---------+---------+------------+
| 10.1.2.3     | US      | android | createNote |
| 10.200.88.99 | FR      | windows | updateNote |
| 10.1.2.3     | US      | android | updateNote |
| 10.200.88.77 | FR      | ios     | createNote |
| 10.1.4.5     | US      | windows | updateTag  |
+--------------+---------+---------+------------+

The first analyst can’t query that table, but can use the ‘filtered.events’ view to see everything but the IP addresses:

$ impala-shell --quiet -l -u analyst1
LDAP password for analyst1:
[testimpala:21000] > select * from sensitive.events;
ERROR: AuthorizationException: User 'analyst1' does not have privileges to
execute 'SELECT' on: sensitive.events
[testimpala:21000] > select * from filtered.events;
+---------+---------+------------+
| country | client  | action     |
+---------+---------+------------+
| US      | android | createNote |
| FR      | windows | updateNote |
| US      | android | updateNote |
| FR      | ios     | createNote |
| US      | windows | updateTag  |
+---------+---------+------------+

And the second analyst can only see the US events:

$ impala-shell --quiet -l -u analyst2
LDAP password for analyst2:
[testimpala:21000] > select * from filtered.events;
ERROR: AuthorizationException: User 'analyst2' does not have privileges to
execute 'SELECT' on: filtered.events
[testimpala:21000] > select * from filtered.events_usonly;
+---------+---------+------------+
| country | client  | action     |
+---------+---------+------------+
| US      | android | createNote |
| US      | android | updateNote |
| US      | windows | updateTag  |
+---------+---------+------------+
[testimpala:21000] > select client, count(*) as c from filtered.events_usonly group by 1;
+---------+---+
| client  | c |
+---------+---+
| android | 2 |
| windows | 1 |
+---------+---+

We also use this setup from Razor and JasperServer through the Impala/Hive JDBC driver, using the JDBC URL jdbc:hive2://testimpala:21050/.

Futures…

The ‘sasldb2’ file is not a perfect long-term solution. There’s no UI for self-management of passwords by our analysts, so a sysadmin will need to help them every time they want to change their password. The flat-file representation relies on root security on the box, so it obviously wouldn’t be appropriate in otherwise poorly secured environments.

We’re sure that the LDAP capabilities will improve over time, although I’d be a bit nervous about using LDAP passwords for database connectivity, since desktop tools with SQL integrations would tend to manage and store passwords insecurely.

The same applies for the Sentry policy file. Manually loading this into HDFS whenever we add a user is manageable for now, but not a long-term solution. We could reduce the churn by creating OS-level accounts in OS-level groups and leveraging those, but that’s replacing one clunky group management solution with another.

We couldn’t figure out how to get Hue to talk to Impala properly with SASL+Sentry enabled, so we currently have Hue/Impala configured to talk to the Impala daemon on one of our data nodes, which does not have this enabled. (We’re using network-level ACLs for isolation for the time being.)

But, overall, this solution will meet our needs for a year or two while we’re still dealing with access from only a small number of analysts.

Streamlining The Evernote Cloud SDK for iOS

In this post we provide an overview of the rebuilt and streamlined Evernote Cloud SDK for iOS. You can try it out for yourself at: github.com/evernote/evernote-cloud-sdk-ios

We’ve always been proud to offer the full sophistication of the Evernote platform to all developers to build on. You have access to the same complete API that our own client apps are built on, so you can do anything we can. What that has too often meant, though, is that you’ve had to understand a lot of that sophistication before accomplishing basic tasks. The more complex technology, such as shared notebooks and Evernote Business notebooks, has been challenging to support without expertise and a fair bit of code.

We’re starting to change that. The developer team here has been working hard for the last few months rethinking our client SDK experience, simplifying and streamlining common workflows. We’re debuting this work with an all-new iOS SDK, which is now live as a new beta version here: github.com/evernote. Want to just turn some text or an image into a note? Maybe just offer a one-stop Save To Evernote button? We’ve made these things really easy. There is also simple support for searching for and downloading notes and thumbnails. We expect these features, plus some other goodies outlined below, will cover the vast majority of the ways apps want to integrate with Evernote. For anything not covered, the full API is always available.

Here are some of the notable improvements you’ll find in the new Evernote Cloud SDK for iOS:

  • Professionally designed “share sheet” UI for “Save To Evernote”. This is the simplest integration and requires almost no work on your part. All source is included, so you can modify it or learn from it.
  • Easy-to-use functions for creation, updating, sharing, search, and download of notes.
  • One-liner attachment of images or other resources to notes.
  • Ability to capture basic HTML content from a web view and generate a note from it.
  • Automatic support for shared and business notebooks. This is completely transparent to the developer.
  • Automatic support for the upcoming “App Notebook” style of authorization.
  • Simple conversion of downloaded notes into data prepared for display in a UIWebView.
  • Significantly reduced code and binary SDK size.
  • An “advanced” set of headers that let you access the full traditional (EDAM) API directly if you need it for more complicated tasks.
  • A build script that will create a bundled framework that you can drop into your project.

We believe that Evernote and your app are better with each other. We also believe that you have many important things to spend your limited development time on. The new Evernote Cloud SDK for iOS is a step towards quicker, reliable integrations that result in better experiences for your users. Give the new SDK a spin, and let us know how it works for you!

About Ben

Ben Zotto is a product and engineering lead at Evernote. He created the Penultimate handwriting app for iPad, and is currently working on the redesigned Evernote SDK for iOS, among other projects.

Ben on Stack Overflow

[RSVP] Evernote Dev Party @ WWDC

The Evernote Dev team is hosting our 3rd annual iOS meetup on the first night of Apple’s developer conference, WWDC, in San Francisco. We’ll be right across the street at the Thirsty Bear Pub!

Tickets are limited, so RSVP while they are still available: Eventbrite

We look forward to connecting in person, answering questions about our API, the Evernote Platform Awards, and the Evernote App Center.

Details:
Monday, June 2nd, 2014
6pm – 9pm
Thirsty Bear Brewing Co.
661 Howard St, San Francisco
California 94105
eventbrite.com/e/evernote-dev-meetup-wwdc-tickets-11766304333

New iOS SDK
Next week we are revealing a brand-new Evernote SDK for iOS, with new tools for quickly passing content into Evernote. You can try out the new SDK here: github.com/evernote/evernote-cloud-sdk-ios

Feedback Needed for New API Feature: App Notebooks

We’re pleased to announce the beta release of our most-requested API feature: Single Notebook Authorization, which we’re calling “App Notebooks”. Beginning this summer, third-party developers can connect their application exclusively to a single notebook within a given user’s Evernote account.

Read the new documentation here: App Notebooks API docs

This benefits both our users and developers:

  • Users can choose which notebook an app can read
  • Developers choose whether their application fits in this model
  • User data is kept secure for business and personal notebooks
  • Server-side optimizations allow for faster API access

How do App Notebooks work?

When a user is prompted to authorize an application to access his/her Evernote account, the user will have a few options:

  • Create a new notebook for the application to use, with the same name as the application (this is the default behavior)
  • Create a new notebook with a custom name
  • Select an existing notebook from their account to be used by the application

The good news is, App Notebooks will require very little in terms of code modifications — most of the magic happens on the server during the OAuth process.
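
As a rough mental model of the three options above, the outcome of the authorization screen can be sketched in a few lines of Python. Everything here is hypothetical (the NotebookChoice class and resolve_app_notebook function are not part of any Evernote SDK); the real selection happens on Evernote's servers during OAuth, and your app simply ends up connected to whichever notebook the user picked.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class NotebookChoice:
    """Hypothetical model of what the user picks on the authorization screen."""
    existing_notebook: Optional[str] = None  # a notebook already in the account
    custom_name: Optional[str] = None        # a name for a brand-new notebook


def resolve_app_notebook(app_name: str,
                         choice: Optional[NotebookChoice] = None) -> Tuple[str, bool]:
    """Return (notebook_name, newly_created) for the notebook the app is tied to."""
    choice = choice or NotebookChoice()
    if choice.existing_notebook:
        # Option 3: reuse a notebook the user already has.
        return choice.existing_notebook, False
    if choice.custom_name:
        # Option 2: create a new notebook with a user-supplied name.
        return choice.custom_name, True
    # Option 1 (default): create a new notebook named after the application.
    return app_name, True
```

The practical takeaway for app code is to avoid assuming a particular notebook name, since the user may have picked a custom or existing one.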

We need your help!

This feature is about to enter private beta testing and we’re looking for a small group of partners to help us work out the kinks. If you’re interested, please let us know by filling out this form. All partners chosen for the beta will receive a complimentary year of Evernote Premium.

FAQs

“What if my app needs access to a user’s entire Evernote account to function properly?”

During the API key creation process, you’ll have the opportunity to request full account access for your API key.

“What about existing API keys?”

All existing API keys will retain their current permission level.

If you have any questions, please post them in the comments below or in our developer forum and we’ll answer them as quickly as possible. Also, documentation for this feature will be provided to testers during the beta period and published to the Evernote Developer site when App Notebooks go live this summer.

Example Flow

[Screenshots: authorizing the app, changing the notebook, choosing a notebook]
