
Sentinel Source Code 5: The Guava Rate-Limiting Algorithm That FlowSlot Borrows From (Part 1)

Outline

1. Usage examples of Guava's RateLimiter

2. Introduction to and design of Guava's RateLimiter

3. Source code of SmoothBursty, a subclass of RateLimiter

4. Source code of SmoothWarmingUp, a subclass of RateLimiter

1. Usage examples of Guava's RateLimiter

(1) Interceptor example

(2) AOP aspect example

(1) Interceptor example

Step 1: Add the Guava dependency to the pom file

<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>27.0.1-jre</version>
</dependency>

Step 2: Define a custom interceptor and implement rate limiting in it

First define an abstract interceptor class that multiple interceptors can reuse. It extends HandlerInterceptorAdapter, overrides preHandle(), and exposes an abstract preFilter() method for subclasses to implement.

public abstract class AbstractInterceptor extends HandlerInterceptorAdapter {
    private Logger logger = LoggerFactory.getLogger(AbstractInterceptor.class);

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
        ResponseEnum result;
        try {
            result = preFilter(request);
        } catch (Exception e) {
            logger.error("preHandle catch a exception:" + e.getMessage());
            result = ResponseEnum.FAIL;
        }
        if (ResponseEnum.SUCCESS.code.equals(result.code)) {
            return true;
        }
        handlerResponse(result, response);
        return false;
    }

    // custom pre-processing, implemented by subclasses
    protected abstract ResponseEnum preFilter(HttpServletRequest request);

    // write the error response
    private void handlerResponse(ResponseEnum result, HttpServletResponse response) {
        ResponseDto responseDto = new ResponseDto();
        responseDto.setCode(result.code);
        responseDto.setStatus(result.status);
        responseDto.setMessage(result.message);
        response.setStatus(HttpServletResponse.SC_OK);
        response.setContentType(MediaType.APPLICATION_JSON_UTF8_VALUE);
        PrintWriter printWriter = null;
        try {
            printWriter = response.getWriter();
            printWriter.write(JsonUtils.toJson(responseDto));
        } catch (Exception e) {
            logger.error("handlerResponse catch a exception:" + e.getMessage());
        } finally {
            if (printWriter != null) {
                printWriter.close();
            }
        }
    }
}

Then define a rate-limiting interceptor. It extends the abstract class above and performs flow control in preFilter().

Flow control is implemented with Guava's RateLimiter class, and the process is simple: define a global limiter with a QPS of 1, then call tryAcquire() to try to obtain a token. If that succeeds the request is allowed through; otherwise a rate-limit response is returned.

@Component("rateLimitInterceptor")
public class RateLimitInterceptor extends AbstractInterceptor {
    private Logger logger = LoggerFactory.getLogger(RateLimitInterceptor.class);

    // a global rate limiter with a QPS of 1
    private static final RateLimiter rateLimiter = RateLimiter.create(1);

    public static void setRate(double limiterQPS) {
        rateLimiter.setRate(limiterQPS);
    }

    @Override
    protected ResponseEnum preFilter(HttpServletRequest request) {
        if (!rateLimiter.tryAcquire()) {
            logger.warn("rate limited......");
            return ResponseEnum.RATE_LIMIT;
        }
        return ResponseEnum.SUCCESS;
    }
}

Step 3: Register the custom interceptor by extending WebMvcConfigurationSupport

@Configuration
public class MyWebAppConfigurer extends WebMvcConfigurationSupport {
    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        // multiple interceptors form an interceptor chain
        // addPathPatterns() adds the URL patterns to intercept
        // excludePathPatterns() excludes URL patterns from interception
        registry.addInterceptor(new RateLimitInterceptor()).addPathPatterns("/**");
        super.addInterceptors(registry);
    }
}

Step 4: Write a Controller that exposes a simple endpoint

@RestController
public class GuavaController {
    private Logger logger = LoggerFactory.getLogger(GuavaController.class);

    @RequestMapping(value = "getUserList", method = RequestMethod.GET)
    public String getUserList() {
        String result = null;
        try {
            result = "request succeeded";
        } catch (Exception e) {
            logger.error("request failed", e);
            return JsonUtils.toJson(ResponseUtils.failInServer(result));
        }
        return JsonUtils.toJson(ResponseUtils.success(result));
    }
}

(2) AOP aspect example

Step 1: Write the custom @Limiter annotation

When an interface needs rate limiting, simply annotate it with @Limiter.

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface Limiter {
    // tokens put into the bucket per second; defaults to 5
    double limitNum() default 5;

    String name() default "";
}

Step 2: Write the aspect logic

@Aspect
@Component
public class RateLimitAspect {
    private static final Logger log = LoggerFactory.getLogger(RateLimitAspect.class);

    private final ConcurrentHashMap<String, RateLimiter> RATE_LIMITER = new ConcurrentHashMap<>();

    // replace the placeholder with the fully-qualified class name of the @Limiter annotation
    @Pointcut("@annotation(your.package.Limiter)")
    public void serviceLimit() {}

    @Around("serviceLimit()")
    public Object around(ProceedingJoinPoint point) throws Throwable {
        // get the signature of the intercepted method
        Signature sig = point.getSignature();
        MethodSignature msig = (MethodSignature) sig;
        // the target object the advice was woven into
        Object target = point.getTarget();
        // reflect the method to read its annotation
        Method currentMethod = target.getClass().getMethod(msig.getName(), msig.getParameterTypes());
        // read the annotation
        Limiter annotation = currentMethod.getAnnotation(Limiter.class);
        // tokens added to the bucket per second, taken from the annotation
        double limitNum = annotation.limitNum();
        // the method name distinguishes the rate-limiting policies
        String methodName = msig.getName();
        RateLimiter rateLimiter = RATE_LIMITER.computeIfAbsent(methodName, k -> RateLimiter.create(limitNum));
        if (rateLimiter.tryAcquire()) {
            log.info("traffic within the normal range");
            return point.proceed();
        } else {
            log.info("you have been rate limited");
            return null;
        }
    }
}

Step 3: Add rate limiting to the business code

Note: the limitNum parameter of @Limiter is the maximum number of calls allowed per second, while name is the limiter's name, which must be globally unique within the project.

@RequestMapping("/test/path")
@RestController
public class LimiterController {
    @PostMapping("/do")
    @Limiter(limitNum = 30, name = "test_name")
    public Result vote(@RequestBody @Validated TestRequest request) {
        // business logic goes here
        return null;
    }
}

2. Introduction to and design of Guava's RateLimiter

(1) Introduction to RateLimiter

(2) RateLimiter implements rate limiting through lazy computation

(3) The SmoothRateLimiter Javadoc

(4) The SmoothWarmingUp Javadoc

(1) Introduction to RateLimiter

RateLimiter is an abstract class. From its Javadoc we learn:

First, the limiter hands out tokens at a fixed, configured rate. Each call to acquire() blocks, if necessary, until enough tokens are available.

Second, RateLimiter is thread-safe. It limits the total rate across all threads, but it does not guarantee fairness.

Third, acquire(1) and acquire(100) have the same effect on the calling request itself. Neither throttles the first request to call the method; the cost falls on the requests that follow.

RateLimiter itself is abstract, and its subclass SmoothRateLimiter adds another layer of abstraction. SmoothRateLimiter has two subclasses, SmoothBursty and SmoothWarmingUp. SmoothWarmingUp can be seen as an upgraded version of SmoothBursty, implemented to make up for SmoothBursty's shortcomings.

So RateLimiter has two concrete subclasses, SmoothWarmingUp and SmoothBursty, both nested inside SmoothRateLimiter. They correspond to two rate-limiting modes: one with a warmup period and one without.

// Conceptually, a rate limiter distributes permits at a configurable rate.
// Each acquire() blocks if necessary until a permit is available, and then takes it.
// Once acquired, permits need not be released.
//
// RateLimiter is safe for concurrent use:
// It will restrict the total rate of calls from all threads.
// Note, however, that it does not guarantee fairness.
//
// Rate limiters are often used to restrict the rate at which some physical or logical resource is accessed.
// This is in contrast to JDK's Semaphore which restricts the number of concurrent accesses instead of the rate.
//
// A RateLimiter is defined primarily by the rate at which permits are issued.
// Absent additional configuration, permits will be distributed at a fixed rate, defined in terms of permits per second.
// Permits will be distributed smoothly,
// with the delay between individual permits being adjusted to ensure that the configured rate is maintained.
//
// It is possible to configure a RateLimiter to have a warmup period during which time
// the permits issued each second steadily increases until it hits the stable rate.
//
// As an example, imagine that we have a list of tasks to execute,
// but we don't want to submit more than 2 per second:
//     final RateLimiter rateLimiter = RateLimiter.create(2.0); // rate is "2 permits per second"
//     void submitTasks(List<Runnable> tasks, Executor executor) {
//         for (Runnable task : tasks) {
//             rateLimiter.acquire(); // may wait
//             executor.execute(task);
//         }
//     }
//
// As another example, imagine that we produce a stream of data, and we want to cap it at 5kb per second.
// This could be accomplished by requiring a permit per byte, and specifying a rate of 5000 permits per second:
//     final RateLimiter rateLimiter = RateLimiter.create(5000.0); // rate = 5000 permits per second
//     void submitPacket(byte[] packet) {
//         rateLimiter.acquire(packet.length);
//         networkService.send(packet);
//     }
//
// It is important to note that the number of permits requested never affects the throttling of the request itself
// (an invocation to acquire(1) and an invocation to acquire(1000) will result in exactly the same throttling, if any),
// but it affects the throttling of the next request.
// I.e., if an expensive task arrives at an idle RateLimiter, it will be granted immediately,
// but it is the next request that will experience extra throttling,
// thus paying for the cost of the expensive task.
@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime") // lots of violations - also how should we model a rate?
public abstract class RateLimiter {
    ......
}

(2) RateLimiter implements rate limiting through lazy computation

The token bucket algorithm generates tokens at a fixed rate and puts them into a bucket. Every request must take a token from the bucket, and a request that cannot get one is blocked or rejected. When tokens are consumed more slowly than they are generated, the unconsumed tokens accumulate in the bucket, so when a burst of traffic arrives it can take tokens straight from the bucket without being throttled.

The leaky bucket algorithm puts requests into a bucket and takes them out for processing at a fixed rate. Once the number of requests waiting in the bucket exceeds the bucket's capacity, subsequent requests are no longer admitted.

The leaky bucket algorithm suits scenarios that must process requests at a strictly fixed rate. Most business scenarios do not actually need strictly paced processing, and most do need the ability to absorb bursts, which is why the token bucket algorithm is usually chosen.

Either algorithm can be implemented through lazy computation. Lazy computation means that no dedicated thread periodically generates tokens (or periodically drains requests from the leaky bucket); instead, the thread calling the limiter computes for itself whether enough tokens are available and how long it needs to sleep. Lazy computation saves one thread's worth of resources.

Guava's RateLimiter achieves its rate-limiting effect through exactly this lazy-computation approach.
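The lazy-computation idea can be sketched in plain Java. This is an illustrative sketch only, not Guava's actual implementation; the class and field names here are made up. There is no background refill thread: each call tops the bucket up based on how much time has elapsed since the last call.

```java
import java.util.function.LongSupplier;

// A lazily-computed token bucket: the calling thread itself figures out
// how many tokens have accrued since the last call, so no refill thread
// is needed. (Illustrative sketch only, not Guava's implementation.)
class LazyTokenBucket {
    private final double permitsPerSecond; // token generation rate
    private final double maxPermits;       // bucket capacity
    private double storedPermits;          // tokens currently in the bucket
    private long lastSyncNanos;            // time of the last lazy top-up
    private final LongSupplier clock;      // injectable clock, for testing

    LazyTokenBucket(double permitsPerSecond, double maxPermits, LongSupplier clock) {
        this.permitsPerSecond = permitsPerSecond;
        this.maxPermits = maxPermits;
        this.storedPermits = maxPermits; // start with a full bucket
        this.clock = clock;
        this.lastSyncNanos = clock.getAsLong();
    }

    // non-blocking: take one token if available, otherwise fail immediately
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        // lazily add the tokens that would have been generated while idle
        double accrued = (now - lastSyncNanos) / 1e9 * permitsPerSecond;
        storedPermits = Math.min(maxPermits, storedPermits + accrued);
        lastSyncNanos = now;
        if (storedPermits >= 1.0) {
            storedPermits -= 1.0;
            return true;
        }
        return false;
    }
}
```

With a rate of 1 permit per second and a capacity of 2, two calls succeed immediately, a third fails, and after one simulated second of idleness another call succeeds again.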

(3) The SmoothRateLimiter Javadoc

// How is the RateLimiter designed, and why?
//
// The primary feature of a RateLimiter is its "stable rate",
// the maximum rate that it should allow at normal conditions.
// This is enforced by "throttling" incoming requests as needed,
// i.e. compute, for an incoming request, the appropriate throttle time,
// and make the calling thread wait as much.
//
// The simplest way to maintain a rate of QPS is to keep the timestamp of the last granted request,
// and ensure that (1/QPS) seconds have elapsed since then.
// For example, for a rate of QPS=5 (5 tokens per second),
// if we ensure that a request isn't granted earlier than 200ms after the last one, then we achieve the intended rate.
// If a request comes and the last request was granted only 100ms ago, then we wait for another 100ms.
// At this rate, serving 15 fresh permits (i.e. for an acquire(15) request) naturally takes 3 seconds.
//
// It is important to realize that such a RateLimiter has a very superficial memory of the past: it only remembers the last request.
// What if the RateLimiter was unused for a long period of time, then a request arrived and was immediately granted?
// This RateLimiter would immediately forget about that past underutilization.
// This may result in either underutilization or overflow,
// depending on the real world consequences of not using the expected rate.
//
// Past underutilization could mean that excess resources are available.
// Then, the RateLimiter should speed up for a while, to take advantage of these resources.
// This is important when the rate is applied to networking (limiting bandwidth),
// where past underutilization typically translates to "almost empty buffers", which can be filled immediately.
//
// On the other hand, past underutilization could mean that
// "the server responsible for handling the request has become less ready for future requests",
// i.e. its caches become stale, and requests become more likely to trigger expensive operations
// (a more extreme case of this example is when a server has just booted,
// and it is mostly busy with getting itself up to speed).
//
// To deal with such scenarios, we add an extra dimension,
// that of "past underutilization", modeled by the "storedPermits" variable.
// This variable is zero when there is no underutilization,
// and it can grow up to maxStoredPermits, for sufficiently large underutilization.
// So, the requested permits, by an invocation acquire(permits), are served from:
//  - stored permits (if available)
//  - fresh permits (for any remaining permits)
//
// How this works is best explained with an example:
// For a RateLimiter that produces 1 token per second,
// every second that goes by with the RateLimiter being unused, we increase storedPermits by 1.
// Say we leave the RateLimiter unused for 10 seconds
// (i.e., we expected a request at time X, but we are at time X + 10 seconds before a request actually arrives;
// this is also related to the point made in the last paragraph),
// thus storedPermits becomes 10.0 (assuming maxStoredPermits >= 10.0).
// At that point, a request of acquire(3) arrives.
// We serve this request out of storedPermits,
// and reduce that to 7.0 (how this is translated to throttling time is discussed later).
// Immediately after, assume that an acquire(10) request arrives.
// We serve the request partly from storedPermits, using all the remaining 7.0 permits,
// and the remaining 3.0, we serve them by fresh permits produced by the rate limiter.
//
// We already know how much time it takes to serve 3 fresh permits:
// if the rate is "1 token per second", then this will take 3 seconds.
// But what does it mean to serve 7 stored permits?
// As explained above, there is no unique answer.
// If we are primarily interested to deal with underutilization,
// then we want stored permits to be given out faster than fresh ones,
// because underutilization = free resources for the taking.
// If we are primarily interested to deal with overflow,
// then stored permits could be given out slower than fresh ones.
// Thus, we require a (different in each case) function that translates storedPermits to throttling time.
//
// This role is played by storedPermitsToWaitTime(double storedPermits, double permitsToTake).
// The underlying model is a continuous function mapping storedPermits (from 0.0 to maxStoredPermits)
// onto the 1/rate (i.e. intervals) that is effective at the given storedPermits.
// "storedPermits" essentially measure unused time;
// we spend unused time buying/storing permits.
// Rate is "permits / time", thus "1 / rate = time / permits".
// Thus, "1/rate" (time / permits) times "permits" gives time,
// i.e., integrals on this function (which is what storedPermitsToWaitTime() computes)
// correspond to minimum intervals between subsequent requests,
// for the specified number of requested permits.
//
// Here is an example of storedPermitsToWaitTime:
// If storedPermits == 10.0, and we want 3 permits,
// we take them from storedPermits, reducing them to 7.0,
// and compute the throttling for these as a call to storedPermitsToWaitTime(storedPermits = 10.0, permitsToTake = 3.0),
// which will evaluate the integral of the function from 7.0 to 10.0.
//
// Using integrals guarantees that the effect of a single acquire(3) is equivalent to
// { acquire(1); acquire(1); acquire(1); }, or { acquire(2); acquire(1); }, etc,
// since the integral of the function in [7.0, 10.0] is equivalent to
// the sum of the integrals of [7.0, 8.0], [8.0, 9.0], [9.0, 10.0] (and so on),
// no matter what the function is.
// This guarantees that we handle correctly requests of varying weight (permits),
// no matter what the actual function is - so we can tweak the latter freely.
// (The only requirement, obviously, is that we can compute its integrals).
//
// Note well that if, for this function, we chose a horizontal line, at height of exactly (1/QPS),
// then the effect of the function is non-existent:
// we serve storedPermits at exactly the same cost as fresh ones (1/QPS is the cost for each).
// We use this trick later.
//
// If we pick a function that goes below that horizontal line,
// it means that we reduce the area of the function, thus time.
// Thus, the RateLimiter becomes faster after a period of underutilization.
// If, on the other hand, we pick a function that goes above that horizontal line,
// then it means that the area (time) is increased, thus storedPermits are more costly than fresh permits,
// thus the RateLimiter becomes slower after a period of underutilization.
//
// Last, but not least: consider a RateLimiter with rate of 1 permit per second,
// currently completely unused, and an expensive acquire(100) request comes.
// It would be nonsensical to just wait for 100 seconds, and then start the actual task.
// Why wait without doing anything?
// A much better approach is to allow the request right away (as if it was an acquire(1) request instead),
// and postpone subsequent requests as needed.
// In this version, we allow starting the task immediately, and postpone by 100 seconds future requests,
// thus we allow for work to get done in the meantime instead of waiting idly.
//
// This has important consequences:
// it means that the RateLimiter doesn't remember the time of the last request,
// but it remembers the (expected) time of the next request.
// This also enables us to tell immediately (see tryAcquire(timeout)) whether a particular timeout is enough to
// get us to the point of the next scheduling time, since we always maintain that.
//
// And what we mean by "an unused RateLimiter" is also defined by that notion:
// when we observe that the "expected arrival time of the next request" is actually in the past,
// then the difference (now - past) is the amount of time that the RateLimiter was formally unused,
// and it is that amount of time which we translate to storedPermits.
// (We increase storedPermits with the amount of permits that would have been produced in that idle time).
// So, if rate == 1 permit per second, and arrivals come exactly one second after the previous,
// then storedPermits is never increased -- we would only increase it for arrivals later than the expected one second.
Suppose the RateLimiter remembered little of the past, only the last request. How should it then behave when it has been unused for a long time and a request arrives?

Because the RateLimiter has been idle for a long stretch ("past underutilization"), two situations are possible: underutilization or overflow.

If it is underutilization, i.e. resources are in surplus, the RateLimiter should grant the newly arriving request immediately. If it is overflow, i.e. the server needs extra time to handle requests because its caches have gone stale and so on, the RateLimiter should throttle the newly arriving request.

So, to handle what the RateLimiter faces after past underutilization, a storedPermits variable is added to model it. When there is no underutilization, storedPermits is zero, and it can grow up to maxStoredPermits for sufficiently large underutilization. The permits served by acquire(permits) therefore come from two parts: stored permits and fresh permits.

When the RateLimiter has been unused for a long time: if that was due to underutilization, stored permits should be handed out faster than fresh ones; if it was due to overflow, stored permits should be handed out slower than fresh ones. Hence we need a function that translates storedPermits into throttling time.

That function is a continuous function mapping storedPermits onto 1/rate. Here storedPermits essentially measures how long the RateLimiter went unused, and tokens are stored during that unused time. The rate is permits over time, i.e. rate = permits / time, so 1 / rate = time / permits, the time each permit costs. The integral of this continuous function therefore corresponds to the minimum time interval between subsequent requests.

Now suppose the function is a horizontal line. If its height is exactly 1/QPS, the function has no effect, because stored permits are then served at exactly the same cost as fresh ones, 1/QPS being the cost of each fresh permit.

If the horizontal line sits below 1/QPS, the area under the function, and hence the time, shrinks, so after a period of underutilization the RateLimiter speeds up.

If the horizontal line sits above 1/QPS, the area under the function, and hence the time, grows. Stored permits then cost more than fresh ones, so after a period of underutilization the RateLimiter slows down.

Note: the RateLimiter does not store the time of the last request; it stores the expected arrival time of the next request. If the expected arrival time of the next request is already in the past, call that time past and the current time now. Then the interval now - past is the time during which the RateLimiter went unused, and for that idle time we increase storedPermits accordingly.
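The idle-time bookkeeping described above can be sketched as follows. This is a simplified rendering of the idea behind SmoothRateLimiter's resync step; the field names follow the Javadoc, but this is not Guava's actual code.

```java
// Sketch of the "remember the next expected request time" bookkeeping.
// If now is past nextFreeTicketMicros, the limiter was idle; the idle
// time is converted into storedPermits. (Illustrative, not Guava's code.)
class NextTicketBookkeeping {
    double storedPermits;                // tokens accumulated while idle
    final double maxPermits;             // upper bound on storedPermits
    final double stableIntervalMicros;   // time per permit, i.e. 1/rate
    long nextFreeTicketMicros;           // expected arrival time of the next request

    NextTicketBookkeeping(double maxPermits, double stableIntervalMicros) {
        this.maxPermits = maxPermits;
        this.stableIntervalMicros = stableIntervalMicros;
        this.storedPermits = 0.0;
        this.nextFreeTicketMicros = 0L;
    }

    // fold any elapsed idle time into storedPermits
    void resync(long nowMicros) {
        if (nowMicros > nextFreeTicketMicros) {
            double newPermits = (nowMicros - nextFreeTicketMicros) / stableIntervalMicros;
            storedPermits = Math.min(maxPermits, storedPermits + newPermits);
            nextFreeTicketMicros = nowMicros;
        }
    }
}
```

For a rate of 1 permit per second (stableIntervalMicros = 1,000,000), three seconds of idleness translate into 3 stored permits, and storedPermits never exceeds maxPermits.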

(4) The SmoothWarmingUp Javadoc

// This implements a "bursty" RateLimiter, where storedPermits are translated to zero throttling.
// The maximum number of permits that can be saved (when the RateLimiter is unused) is defined in terms of time,
// in this sense: if a RateLimiter is 2qps, and this time is specified as 10 seconds, we can save up to 2 * 10 = 20 permits.
static final class SmoothBursty extends SmoothRateLimiter {
    ......
}

// This implements the following function where coldInterval = coldFactor * stableInterval.
//
//          ^ throttling
//          |
//    cold  +                  /
// interval |                 /.
//          |                / .
//          |               /  .   ← "warmup period" is the area of the trapezoid between
//          |              /   .     thresholdPermits and maxPermits
//          |             /    .
//          |            /     .
//          |           /      .
//   stable +----------/  WARM .
// interval |          .   UP  .
//          |          . PERIOD.
//          |          .       .
//        0 +----------+-------+--------------→ storedPermits
//          0 thresholdPermits maxPermits
//
// Before going into the details of this particular function, let's keep in mind the basics:
// The state of the RateLimiter (storedPermits) is a vertical line in this figure.
// When the RateLimiter is not used, this goes right (up to maxPermits).
// When the RateLimiter is used, this goes left (down to zero), since if we have storedPermits, we serve from those first.
// When unused, we go right at a constant rate! The rate at which we move to the right is chosen as maxPermits / warmupPeriod.
// This ensures that the time it takes to go from 0 to maxPermits is equal to warmupPeriod.
// When used, the time it takes, as explained in the introductory class note,
// is equal to the integral of our function, between X permits and X-K permits,
// assuming we want to spend K saved permits.
// (The vertical axis is 1 / rate = time / permits, the time each permit costs.)
//
// In summary, the time it takes to move to the left (spend K permits)
// is equal to the area of the function of width == K.
//
// Assuming we have saturated demand, the time to go from maxPermits to thresholdPermits is equal to warmupPeriod.
// And the time to go from thresholdPermits to 0 is warmupPeriod/2.
// (The reason that this is warmupPeriod/2 is to maintain the behavior of the original implementation
// where coldFactor was hard coded as 3.)
//
// It remains to calculate thresholdPermits and maxPermits.
// The time to go from thresholdPermits to 0 is equal to the integral of the function between 0 and thresholdPermits.
// This is thresholdPermits * stableInterval. By (5) it is also equal to warmupPeriod/2.
// Therefore thresholdPermits = 0.5 * warmupPeriod / stableInterval.
// The time to go from maxPermits to thresholdPermits is equal to the integral of the function between thresholdPermits and maxPermits.
// This is the area of the pictured trapezoid, and it is equal to 0.5 * (stableInterval + coldInterval) * (maxPermits - thresholdPermits).
// It is also equal to warmupPeriod, so maxPermits = thresholdPermits + 2 * warmupPeriod / (stableInterval + coldInterval).
static final class SmoothWarmingUp extends SmoothRateLimiter {
    ......
}

Note 1: RateLimiter is a concrete implementation of the token bucket algorithm

Its token-acquiring and token-refilling methods can nevertheless also be understood alongside the flow of the leaky bucket algorithm, since acquire() smooths requests out by delaying them.

Note 2: the variables in the warmup-model figure

The horizontal axis is the number of tokens in the bucket, and the vertical axis is the interval at which tokens are generated. Tokens are consumed from right to left. When the RateLimiter is unused, i.e. idle, tokens are generated and put into the bucket.

SmoothBursty generates tokens at a fixed interval:

stableInterval = 1 / permitsPerSecond;

SmoothWarmingUp generates tokens at an interval that varies with storedPermits:

when storedPermits < thresholdPermits, the interval is stableInterval;
when storedPermits > thresholdPermits, the interval varies, growing toward coldInterval;

Variable 1: stableIntervalMicros

The token-generation interval once the system has warmed up. If the QPS limit is 100, a token is generated every 10ms.

stableIntervalMicros = 1 / permitsPerSecond

Variable 2: coldIntervalMicros

The token-generation interval when the system is at its coldest; determined by coldFactor.

Variable 3: coldFactor

The cold factor, a multiplier: coldInterval is coldFactor times stableInterval.

Variable 4: thresholdPermits

The threshold at which the warmup phase is entered. When the number of tokens in the bucket drops to this threshold, warmup ends. When the number of tokens in the bucket is above this threshold, the system is in cold-start mode.

Variable 5: maxPermits

The capacity of the token bucket. Once the number of tokens in the bucket reaches this capacity, newly generated tokens are discarded.

Variable 6: slope

The slope of the warmup segment. It is used to compute the current token-generation interval, and hence how many tokens can be generated per second.

Variable 7: warmupPeriod

The system warmup time, i.e. the area of the trapezoid. In the warmup-model figure, trapezoid area = (coldFactor - 1) * rectangle area.

From the trapezoid-area formula: given warmupPeriodMicros and permitsPerSecond, both thresholdPermits and maxPermits can be computed. That is, thresholdPermits = 0.5 * warmupPeriodMicros / stableInterval, and with the default coldFactor of 3, maxPermits works out to warmupPeriodMicros / stableInterval.

stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
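Plugging in concrete numbers makes these formulas tangible. The sketch below uses assumed example values (QPS 100, a 2-second warmup, the default coldFactor of 3; the slope formula follows SmoothWarmingUp's definition) and verifies that maxPermits then indeed equals warmupPeriod / stableInterval:

```java
// Worked example of the SmoothWarmingUp formulas with assumed values:
// permitsPerSecond = 100, warmupPeriod = 2 s, coldFactor = 3 (the default).
public class WarmupMath {
    public static void main(String[] args) {
        double permitsPerSecond = 100.0;
        double warmupPeriodMicros = 2_000_000.0; // 2 seconds
        double coldFactor = 3.0;

        // time per token once warmed up: 10_000 µs (10 ms)
        double stableIntervalMicros = 1_000_000.0 / permitsPerSecond;
        // time per token when completely cold: 30_000 µs
        double coldIntervalMicros = coldFactor * stableIntervalMicros;

        // rectangle: thresholdPermits * stableInterval = warmupPeriod / 2
        double thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
        // trapezoid: 0.5 * (stable + cold) * (max - threshold) = warmupPeriod
        double maxPermits = thresholdPermits
                + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
        // slope of the warmup segment
        double slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);

        System.out.println(thresholdPermits); // 100.0
        System.out.println(maxPermits);       // 200.0
        // with coldFactor == 3, maxPermits == warmupPeriod / stableInterval
        System.out.println(maxPermits == warmupPeriodMicros / stableIntervalMicros); // true
        System.out.println(slope);            // 200.0
    }
}
```

With coldFactor fixed at 3, the trapezoid contributes exactly as many permits as the rectangle, which is why maxPermits collapses to warmupPeriod / stableInterval in that default case.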
