Query API with startId, endId, and count

I recently had to write a query API for items in a service. Each item has a numerical, consecutively increasing integer as identifier. Therefore, it made sense to allow callers to specify a range. It also makes sense to specify the number of results to be returned, since we don’t want the result to get too big. The numeric identifiers actually provide a natural way of pagination.

Specifying a range using startId and endId is simple. So is startId and count. But endId and count is tricky. We can’t just start at endId - count + 1, because some of the items in that range might get filtered out because of other query settings, and then we’d return fewer than count items.

Okay, so we include all items, filter first, and then return the last count items. That works, but that creates an unnecessarily long temporary list, which may cause memory problems.

I solved this by using a NavigableMap, and when endId and count are used together, I process a reverse view of the Map using NavigableMap.descendingMap(). Then I can simply limit the number of items and after filtering reverse the list again.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
class Store {
  private final NavigableMap<Integer, CachedItem> m_cache;

  /**
   * Get a list of items.
   * @param includePredicate predicate that returns true if an item should be included in the list
   * @param reverse true to process the items in reverse order
   * @param maxCount optional maximum size of the result
   * @return list of items
   */

  public List<Item> getItems(Predicate<Item> includePredicate, boolean reverse, Optional<Integer> maxCount) {
    // NOTE: the cache currently holds all items ever created in memory... how could that possibly go wrong?

    Stream<Item> stream = (reverse ? m_cache.descendingMap() : m_cache).values().stream()
            .map(cached -> cached.copyItem())
            .filter(includePredicate);

    if (maxCount.isPresent()) {
      stream = stream.limit(maxCount.get());
    }

    return ImmutableList.copyOf(stream.collect(Collectors.toList()));
  }
}

class Controller {
  @RequestMapping(value="/items", method= RequestMethod.GET)
  public Response queryItems(@PathVariable String apiVersion,
                             @RequestParam(value = "filter", required = false) String[] filter,
                             @RequestParam(value = "start", required = false) @Min(1) Integer start,
                             @RequestParam(value = "end", required = false) @Min(1) Integer end,
                             @RequestParam(value = "count", required = false) Integer count,
                             HttpServletRequest request) {
    Response rval = new Response();

    if (start != null && end != null && count != null) {
      throw new IllegalArgumentException("Cannot use start, end, and count all together");
    }

    Predicate<Item> statusPredicate = getStatusPredicate(filter);
    Predicate<Item> idPredicate = getIdPredicate(Optional.ofNullable(start), Optional.ofNullable(end));

    // if we are using an end value and count, we have to process the list in reverse to make sure we accumulate
    // the correct number of jobs
    boolean reverse = (count != null) && (end != null);

    List<Item> items = m_asyncJobStore.getItems(statusPredicate.and(idPredicate),
                reverse, Optional.ofNullable(count));

    if (reverse) {
      items = Lists.reverse(items);
    }

    ItemsQueryType itemsQuery = new ItemsQueryType();
    itemsQuery.setItems(items);
    rval.setItemsQuery(itemsQuery);

    return rval;
  }

  @VisibleForTesting
  static Predicate<AsyncJobType> getIdPredicate(Optional<Integer> start, Optional<Integer> end) {
    Predicate<AsyncJobType> idPredicate = asyncJobType -> true;
    if (start.isPresent()) {
      idPredicate = idPredicate.and(asyncJobType -> asyncJobType.getId() >= start.get());
    }
    if (end.isPresent()) {
      idPredicate = idPredicate.and(asyncJobType -> asyncJobType.getId() <= end.get());
    }
    return idPredicate;
  }

  @VisibleForTesting
  static Predicate<AsyncJobType> getStatusPredicate(String[] filter) {
    Predicate<AsyncJobType> statusPredicate;
    if (filter == null || filter.length == 0) {
      statusPredicate = asyncJobType -> true; // include everything
    } else {
      statusPredicate = asyncJobType -> {
        boolean include = false;
        for(String status: filter) {
          if (status.equalsIgnoreCase(asyncJobType.getStatus().name())) {
            include = true;
            break;
          }
        }
        return include;
      };
    }
    return statusPredicate;
  }
}

I’m pretty happy with this solution for returning the right number of items, no matter how the range is specified.

The big problem, however, is that the class already caches everything in memory before I added the query API. Therefore, this query change could be as well-written as possible, and the cache would still blow up the heap at some point.

Unfortunately, many people don’t seem to think at that scale…

Share

About Mathias

Software development engineer. Principal developer of DrJava. Recent Ph.D. graduate from the Department of Computer Science at Rice University.
This entry was posted in Code Pranger, Programming. Bookmark the permalink.

Leave a Reply