Schema Design Guide

This document is a reference for designing GraphQL schemas at Udemy. It is inspired by Shopify's GraphQL Design Guide, but we have deviated from it too much to adopt that guide as our own. Read this document before Creating a New Schema Request.

A Fresh Start

The first thing we'd like to emphasize is that our GraphQL API is almost like a "new start". We are not bound by previously made design choices or legacy code. When creating your schema, you should have the following question on your mind:

Does this schema make sense to anyone without domain-specific knowledge?

A good example is how we handle categories in GraphQL. In the domain logic, you first fetch a course's subcategories and then derive the course's categories from those subcategories. This might make sense given our data structures, but it is not intuitive. So the GraphQL schema exposes course -> categories -> subcategories.

Domain Logic vs Business Logic

When defining your schema, we implore you to reduce domain logic as much as possible. This is especially important if your schema is exposed to partners.

We define Domain Logic as:

Rules and processes that encapsulate the core functionality of a specific business area. They are usually tailored to the internal understanding of the business and might include terminology, concepts, and practices that are familiar to insiders but foreign to others.

Examples: CLP, SDTag, UFB

We define Business Logic as:

Broad rules and operations that handle the way data is created, displayed, and modified, often in a way that's supposed to be universally understood.

Examples: Course Landing Page, Udemy Business

In designing a GraphQL API, it's essential to avoid internal jargon like "CLP" and instead use terms like "Course Landing Page" that make sense to external customers. This practice fosters a more user-friendly experience, as it helps in ensuring that the API is self-descriptive and intuitive. When internal terms are used, it can create confusion and increase the learning curve for those interacting with the API who are not familiar with the internal language.

Remember (0) - Add comments to all types, queries, fields, mutations, scalars and enums. These comments should not only make sense to you and your team, but also to anyone reading the schema definition.
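
As a minimal sketch of what this looks like in practice, the fragment below attaches a description to a type, its fields, and an enum. The field names and the enum values are hypothetical and chosen only for illustration; the point is that each description uses business terms such as "Course Landing Page" rather than internal abbreviations.

```graphql
"""
A course that learners can enroll in.
"""
type Course {
  "Unique identifier of the course."
  id: ID!
  "Title shown on the Course Landing Page."
  title: String!
  "How demanding the course is for a learner."
  difficulty: DifficultyLevel
}

"How demanding a learning product is for a learner."
enum DifficultyLevel {
  "Suitable for learners with no prior knowledge of the topic."
  BEGINNER
  "Assumes the learner already knows the basics of the topic."
  INTERMEDIATE
}
```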

Internal vs External

Before you start creating your schema, you need to decide whether it will be internal or external. External means that it is exposed in the Developer Portal and can be called not only by Udemy clients but also by third parties such as Udemy Business partners.

Types

When defining types, it is a good idea to think of your schema as an Entity-Relationship model. We have entities (Course, User, Organization) and relationships between them. The relationships you define in GraphQL have to make sense from a business perspective. The underlying logic is something the GraphQL implementation is responsible for.

For example, GraphQL would not have a CourseHasUser relationship, but rather an enrolledUsers field on Course. The implementation would then likely use the CourseHasUser table in the resolver.
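
A rough sketch of that relationship in schema form is shown below. The User type and the exact shape of the list are assumptions for illustration, and pagination is left out for brevity. Note that nothing named CourseHasUser appears in the schema; the join table stays an implementation detail of the resolver.

```graphql
"A person with a Udemy account."
type User {
  "Unique identifier of the user."
  id: ID!
}

"A course that learners can enroll in."
type Course {
  "Unique identifier of the course."
  id: ID!
  "Users who are enrolled in this course."
  enrolledUsers: [User!]!
}
```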

When introducing a new type, remember that every type is top level within the graph, so its name has to be unambiguous on its own. Avoid generic words like "Reminder" or "Enrollment" and instead prefix them with the business object they belong to.

Good: LearningReminder, CourseEnrollment
Bad: Item, Rating

Remember (1) - All Types Are Top Level

Queries

Queries in GraphQL are the only way to retrieve the objects you have defined. We have a few rules around queries.

Remember (2) - Query names should not sound RESTful

Instead of having queries such as getCourses, listCourses or getCourse we use the name of the actual object.

Good: course, courses, labs
Bad: getCourses, getLabs, getAllLabs

Remember (3) - All queries query by ID by default; if a query uses something else, that should be in the query name

All queries should take an ID or IDs by default. For example: course(id: ID), courses(ids: [ID!]!). If you query by something other than the object's own ID, say so in the query name:

reviewsByCourseId(courseId: ID)

Exception: searchCourses(query: String)

Remember (4) - Queries without ID or filter arguments run for the current authenticated subject (user, organization etc)

When a query returns data that belongs only to the current user, we do not put that in the name with queries such as learningRemindersForMyUser.

Good: learningReminders
Bad: learningRemindersForMyUser
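
Put together, the query rules above could look roughly like the following sketch. The query names come from this guide; the stub types, nullability, and return types are assumptions made only so the fragment is complete.

```graphql
# Stub types so the fragment stands on its own.
type Course { id: ID! }
type Review { id: ID! }
type LearningReminder { id: ID! }

type Query {
  "Fetch a single course by its ID."
  course(id: ID!): Course
  "Fetch several courses by their IDs."
  courses(ids: [ID!]!): [Course!]!
  "Reviews are looked up by a course ID, not a review ID, so the name says so."
  reviewsByCourseId(courseId: ID!): [Review!]!
  "Free-text search, the exception to the ID naming rule."
  searchCourses(query: String!): [Course!]!
  "No ID or filter arguments: returns reminders for the authenticated subject."
  learningReminders: [LearningReminder!]!
}
```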

Pagination

We do not use Relay's spec with "Nodes" and "Edges". Instead, we generally use offset pagination, although we also have cursor pagination. With offset pagination, a paginated request carries a page number and a page size. For your query's response to be paginated, its return type needs to implement the Paginated interface.

interface Paginated {
  "The total number of pages. Calculated as (total result count / page size), rounded up"
  pageCount: Int!
  "The current page number, 0 based"
  page: Int!
}

The result can be one of two types: [object]Paged or [object][verb]Response. We distinguish between them with the following rules:

Use [object]Paged (e.g. LearningProductsPaged) when you are returning only a paged list of objects, such as a list of courses or a list of users.
Use [object][verb]Response (e.g. CourseSearchResponse) when you are returning metadata in addition to the paged results, such as filterOptions.
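
The two result shapes can be sketched as follows. Paginated, LearningProductsPaged, CourseSearchResponse, and filterOptions are named in this guide; the stub types and the items, courses, key, and label fields are hypothetical placeholders.

```graphql
interface Paginated {
  pageCount: Int!
  page: Int!
}

# Hypothetical stand-ins so the fragment is self-contained.
type LearningProduct { id: ID! }
type Course { id: ID! }
type CourseSearchFilterOption { key: String!, label: String! }

"A plain paged list: only the page fields and the objects themselves."
type LearningProductsPaged implements Paginated {
  pageCount: Int!
  page: Int!
  items: [LearningProduct!]!
}

"Paged results plus metadata about the search, hence a Response type."
type CourseSearchResponse implements Paginated {
  pageCount: Int!
  page: Int!
  courses: [Course!]!
  filterOptions: [CourseSearchFilterOption!]!
}
```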

Mutations

The only way to mutate data through GraphQL is with mutations. You could compare a mutation with a POST request in REST.

Remember (5) - All Mutations use [noun][verb] for intuitive documentation sorting

Because GraphQL documentation is sorted alphabetically, verb-first names would cluster all the "create" and "delete" mutations together. Instead, we want each mutation listed next to the object it acts on. It is a little counterintuitive but worth it in the long run.

Good: courseCreate, courseDelete, occupationAssign
Bad: createCourse, assignOccupation
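
As an illustration, a Mutation type following this rule might look like the sketch below. The mutation names are the ones listed above; the arguments, the CourseCreateInput type, and the return types are assumptions. Sorted alphabetically, the two course mutations end up next to each other.

```graphql
type Course {
  id: ID!
  title: String!
}

# Hypothetical input type for illustration.
input CourseCreateInput {
  title: String!
}

type Mutation {
  "Create a new course."
  courseCreate(course: CourseCreateInput!): Course!
  "Delete a course by its ID. Returns true when the course was deleted."
  courseDelete(id: ID!): Boolean!
  "Assign an occupation to the authenticated user."
  occupationAssign(occupationId: ID!): Boolean!
}
```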

Remember (6) - Update mutations are full updates; partial updates need to be named accordingly

We expect a full update when you create a mutation called organizationUpdate(organization: Organization!). This means the caller has to send every field populated with its current data, because any field left out is overwritten. If you want a partial update, name your mutation after what it changes, such as organizationUpdateOwner(organizationId: ID!, ownerId: ID!).

Note: This goes against Shopify's guide rule #21
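
The difference between the two kinds of update can be sketched like this. GraphQL only accepts input types as arguments, so the sketch uses a hypothetical OrganizationInput where the text above writes Organization; the fields and return types are likewise assumptions.

```graphql
type Organization {
  id: ID!
  name: String!
  ownerId: ID!
}

# Hypothetical input type mirroring every writable field of Organization.
input OrganizationInput {
  id: ID!
  name: String!
  ownerId: ID!
}

type Mutation {
  "Full update: every field in the input replaces the stored value."
  organizationUpdate(organization: OrganizationInput!): Organization!
  "Partial update: changes only the owner and leaves all other fields untouched."
  organizationUpdateOwner(organizationId: ID!, ownerId: ID!): Organization!
}
```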

Scalars

We define a few custom scalars such as CourseDuration and Date. Only create a new scalar if a GraphQL primitive does not suit your needs or you need validation.

CourseDuration is always in the format "5h 21m 30s". For example, "Haha lmao" would be a valid String but of course not a valid CourseDuration. A scalar also allows for custom serialization and deserialization, so you could implement a scalar resolver function that takes seconds (e.g. 19290) and returns a CourseDuration (e.g. "5h 21m 30s"). Date does the same for Unix timestamps.
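
On the schema side, a custom scalar is only a declaration plus a description; the validation and serialization logic lives in the scalar's resolver. A minimal sketch, in which the Course fields are hypothetical:

```graphql
"""
Duration of a course, formatted like "5h 21m 30s".
"""
scalar CourseDuration

"A date. The scalar's resolver converts it from a Unix timestamp."
scalar Date

type Course {
  id: ID!
  "Total length of the course content."
  duration: CourseDuration!
  "When the course was first published."
  publishedAt: Date
}
```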

Enums

Remember one of the first paragraphs? Separating business logic from domain logic is very important. Don't just copy over your protobuf enum values, including "UNKNOWN_OPTION". Instead, think about which options the end user can pick from or submit.

Good: DifficultyLevel, LearningProductType
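
A sketch of an enum written from the end user's point of view is shown below. The values are hypothetical; what matters is that each one is an option a user could actually choose, and that no placeholder such as UNKNOWN_OPTION is carried over from the protobuf definition.

```graphql
"How demanding a learning product is for a learner."
enum DifficultyLevel {
  "Suitable for learners with no prior knowledge of the topic."
  BEGINNER
  "Assumes the learner already knows the basics of the topic."
  INTERMEDIATE
  "Aimed at learners with substantial experience in the topic."
  EXPERT
}
```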

Scopes

Finally, we use scopes to determine what clients can see and do. Every object you create needs to have a permission assigned to it in permissions.ts. If you have an internal operation that can only be called by Udemy applications and is not exposed in the Developer Portal, add udemy:application.

Scopes video: https://learning.udemy.com/course/how-to-softare-at-udemy/learn/lecture/39471218

Error handling

We currently do not have a guide for error handling, as we haven't encountered a business case for typed error handling. Read https://productionreadygraphql.com/2020-08-01-guide-to-graphql-errors for things to consider when handling errors.
